BETA 1,2-XYLOSYLTRANSFERASE GENE FROM ARABIDOPSIS

The invention relates to polynucleotides coding for a
β1,2-xylosyltransferase. Furthermore, the invention relates to
vectors comprising these polynucleotides, recombinant host cells,
plants and insects transfected with the polynucleotides or with
DNA derived therefrom, respectively, as well as to glycoproteins
produced in these systems.

Glycoproteins exhibit a variety and complexity of carbohydrate 
units, the composition and arrangement of the carbohydrates being 
characteristic of different organisms. The oligosaccharide units 
of the glycoproteins have a number of tasks, e.g. they are impor- 
tant in regulating metabolism, they are involved in transmitting 
cell-cell interactions, they determine the circulation periods of 
proteins in circulation, and they are decisive for recognizing 
epitopes in antigen-antibody reactions. 

The glycosylation of glycoproteins starts in the endoplasmic
reticulum (ER), where the oligosaccharides are bound either to
asparagine side chains by N-glycosidic bonds or to serine or
threonine side chains by O-glycosidic bonds. The N-bound
oligosaccharides contain a common core of a pentasaccharide unit
which consists of three mannose and two N-acetylglucosamine
residues. To modify the carbohydrate units further, the proteins
are transported from the ER to the Golgi complex. The structure
of the N-bound oligosaccharide units of glycoproteins is
determined by their conformation and by the composition of the
glycosyltransferases of the Golgi compartments in which they are
processed.

It has been shown that the core pentasaccharide unit of the
N-glycans of some plants is substituted by β1,2-bound xylose and
α1,3-bound fucose (Lerouge et al., 1998, Plant Mol. Biol. 38,
31-48; Rayon et al., 1998, J. Exp. Bot. 49, 1463-1472). The
heptasaccharide "MMXF3" constitutes the main oligosaccharide type
in plants (Kurosaka et al., 1991, J. Biol. Chem. 266, 4168-4172;
Wilson and Altmann, 1998, Glycoconj. J. 15, 1055-1070). These
structures are also termed complex N-glycans or mannose-deficient
or truncated N-glycans, respectively. The α-mannosyl residues may
be further replaced by GlcNAc, to which galactose and fucose are
bound so that a structure is prepared which corresponds to the
human Lewis a epitope (Melo et al., 1997, FEBS Lett. 415, 186-191;
Fitchette-Laine et al., 1997, Plant J. 12, 1411-1417).

Neither β1,2-xylose nor α1,3-bound fucose exists in mammalian
glycoproteins. It has been found that the β1,2-xylose together
with α1,3-fucose plays an important role in the epitope
recognition of antibodies which are directed against plant N-bound
oligosaccharides, and thereby triggers immune reactions in human
or animal bodies against these oligosaccharides (Faye et al.,
1993, Anal. Biochem. 209, 104-108). The β1,2-xylose and/or
α1,3-fucose containing N-glycans furthermore seem to be one of the
main causes of the widespread allergic cross-reactivity between
various plant and insect allergens and are also termed
"cross-reactive carbohydrate determinants" (CCDs). Due to the
frequent occurrence of immunological cross-reactions, the CCDs
moreover mask allergy diagnoses.

The immunological reactions triggered in the human body by plant
proteins are the main problem in the medicinal use of recombinant
human proteins produced in plants. To circumvent this problem,
β1,2-xylosylation together with α1,3-fucosylation would have to
be prevented. According to a study, a mutant of the plant
Arabidopsis thaliana was isolated in which the activity of
N-acetylglucosaminyltransferase I, the first enzyme in the
biosynthesis of complex glycans, is missing. The biosynthesis of
the complex glycoproteins in this mutant thus is disturbed.
Nevertheless, these mutant plants are capable of developing
normally under certain conditions (A. Schaewen et al., 1993,
Plant Physiol. 102, 1109-1118).

To block specifically the transfer of the β1,2-xylose to an
oligosaccharide without also interfering in other glycosylation
steps, solely that enzyme would have to be inactivated which is
directly responsible for this specific glycosylation, i.e. the
β1,2-xylosyltransferase. This transferase, which only occurs in
plants and some non-vertebrate animal species, e.g. in
Schistosoma sp. (Khoo et al., 1997, Glycobiology 7, 663-677) and
snail (e.g. Mulder et al., 1995, Eur. J. Biochem. 232, 272-283),
yet not in human beings or in other vertebrates, would have to be
inactivated on purpose or suppressed so that human proteins which
are produced in plants or in plant cells, respectively, would no
longer contain this immune-reaction-triggering epitope, as has
been the case so far.

β1,2-xylosyltransferase transfers the D-xylose from UDP-xylose to
the beta-linked mannose of plant N-linked oligosaccharides.

This enzyme was purified from soybean microsomes in 1997 (Zeng et
al., J. Biol. Chem. 272, 31340-31347, 1997). According to this
article, the best acceptor for xylose transfer was
GlcNAc2Man3GlcNAc2-T, but GlcNAc1Man3GlcNAc2, with the GlcNAc on
the 3-branch, was also a good acceptor. Furthermore, a number of
other N-linked oligosaccharides were poor acceptors, especially
those with galactose units at the nonreducing termini.

In the article by Rayon et al. (Plant Physiology, 1999, 119,
725-733) it is mentioned that Arabidopsis proteins are
N-glycosylated by high-mannose-type N-glycans and by xylose- and
fucose-containing oligosaccharides. Tezuka et al. (Eur. J.
Biochem. 203, 401-413 (1992)) measured the activities of
different enzymes, for example β1,2-xylosyltransferase, in the
Golgi fraction of suspension-cultured cells of sycamore. They
demonstrated that xylose was transferred onto the inner mannose
by β1,2-xylosyltransferase. Furthermore, they mentioned that
xylose-containing oligosaccharides are widely distributed
throughout the plant kingdom, although xylose-containing N-linked
oligosaccharides were also found in glycoproteins from gastropods
and Chlorophyceae.

For the specific suppression or inactivation of proteins it is
best to carry this out at the level of the transcription or
translation step, respectively. For this it is necessary to
isolate and sequence the nucleotide sequence which codes for the
active protein.

As mentioned above, the soybean β1,2-xylosyltransferase was
isolated and purified in 1997. Only a part of the
xylosyltransferase cDNA has been isolated (see WO 99/29835 A1,
SEQ ID NO: 6 and 7); however, the complete cDNA, which codes for
the active protein, could not be isolated and characterized so
far. The reason why the nucleotide sequence has not been
identified so far could be major problems in the procedure due to
the very low abundance of the mRNA which codes for the
xylosyltransferase in the organisms, as for example soybeans.
Until now, although several groups have tried to identify the
entire nucleotide sequence of this gene, usually from soybeans,
it was not possible to produce full-length cDNA which corresponds
to the xylosyltransferase mRNA with the help of conventional
methods known to be effective in usual cases, for example with
the help of RACE amplification (rapid amplification of cDNA
ends). With this method unknown sequences are amplified with the
help of specific amplification primers. Potential reasons for
unsuccessful 5'-RACE experiments can be an inadequate choice of
specific PCR primers as well as the presence of reverse
transcriptase-inhibiting components during cDNA synthesis.

One problem of the isolated soybean β1,2-xylosyltransferase is
that its solubility and activity depend on the presence of
detergents.

In addition to the problem of the extremely low concentration of
β1,2-xylosyltransferase mRNA there is furthermore the problem
that the secondary structure at the 5'-end of the RNA seems to
hinder the amplification of this region. In these cases, the RACE
amplification, which is in itself a sensitive method, does not
result in the correct and complete xylosyltransferase cDNA
sequence.

It is of course also very likely that the mRNA and the cDNA
derived therefrom, besides being present only in very low
concentrations, recombine and mutate very easily in the course of
the various manipulations. For these and other potential reasons
the cloning and the expression of this specific gene was
impossible until now.

It is an object of the present invention to clone and to sequence
the whole gene which codes for a plant β1,2-xylosyltransferase,
and to prepare vectors comprising this gene or an altered DNA or
a DNA derived therefrom, to transfect plants as well as cells
thereof with one of these vectors, to produce glycoproteins that
do not contain the normally occurring β1,2-xylose, as well as to
provide corresponding methods therefor.

A further object is the production of large quantities of
purified recombinant enzyme in order to allow in vitro synthesis
of homogeneous N-glycans or glycoconjugates containing
β1,2-xylose. This will aid the further elucidation of the role of
β1,2-xylose in the immunogenicity and allergenicity of plant
glycoproteins.

The object according to the invention is achieved by a DNA
molecule comprising a sequence according to SEQ ID NO: 8 with an
open reading frame from base pair 227 to base pair 1831, or being
at least 50% homologous to the above sequence, or hybridizing
with the above-indicated sequence under stringent conditions, or
comprising a sequence which has degenerated to the above DNA
sequence due to the genetic code, the sequence coding for a plant
protein which has β1,2-xylosyltransferase activity or being
complementary thereto. This complete sequence, which has never
been described before, is particularly useful for experiments,
analysis and production processes which concern the
β1,2-xylosyltransferase activity. This sequence can be used
especially for the inactivation or suppression of the
β1,2-xylosyltransferase as well as for overexpression and
production of the recombinant enzyme.

Upon searching the GenBank+EMBL+DDBJ+PDB databases using the
soybean xylosyltransferase-derived peptides mentioned above,
several polypeptide sequences (from Arabidopsis and Drosophila)
with significant homologies were retrieved. However, these
sequences were otherwise unrelated to each other as well as to
the β1,2-xylosyltransferase sequence finally identified.
Successful retrieval of a candidate sequence for
β1,2-xylosyltransferase was possible only by properly assembling
three soybean β1,2-xylosyltransferase-derived peptide sequences
into a single sequence. All search strategies we used according
to the present state of the art (i.e. using the peptide sequences
separately or in combination with each other) did not lead to a
successful retrieval of the correct sequence for
β1,2-xylosyltransferase.

The isolation and purification of this gene was achieved by
searching the DDBJ+GenBank+EMBL+PDB databases for sequences
corresponding to three known peptides (used as assembled
peptides) of the soybean xylosyltransferase (patent WO 99/29835
A1, SEQ ID NO: 3 and 5). It was found that one DNA sequence of
Arabidopsis thaliana which had not previously been assigned to
any protein showed homology to two of the three peptides. With
the help of the gene-finder program a predicted protein sequence
was found, according to which sequence-specific primers for an
RT-PCR were designed. It was possible to produce a first-strand
cDNA corresponding to the mRNA of the A. thaliana
xylosyltransferase gene, after which the first-strand cDNA was
subjected to a PCR using the specifically designed primers. The
reason for the successful production of A. thaliana
xylosyltransferase cDNA may be, on the one hand, that the
xylosyltransferase mRNA of A. thaliana is less problematic than
that of other plant species; on the other hand, the PCR was
performed with optimally designed gene-specific primers.

The open reading frame of SEQ ID NO: 8 codes for a protein with
534 amino acids and with a theoretical molecular weight of
60.2 kDa, a transmembrane portion presumably being present in the
region between Ile11 and Phe29. The calculated pI value of the
encoded protein of the sequence according to SEQ ID NO: 9 is
7.52.
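
The stated reading frame and protein length can be cross-checked with a little arithmetic. The following Python sketch is only an illustration; it assumes that the coordinates 227 and 1831 are 1-based and inclusive and that the final codon of the frame is a stop codon (the function name `orf_summary` is hypothetical):

```python
def orf_summary(start_bp: int, end_bp: int) -> dict:
    """Summarise an open reading frame from 1-based, inclusive coordinates.

    Assumption: the last codon of the frame is a stop codon, so the
    encoded protein has one residue fewer than the number of codons.
    """
    length = end_bp - start_bp + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of three")
    codons = length // 3
    return {"nucleotides": length, "codons": codons, "amino_acids": codons - 1}


summary = orf_summary(227, 1831)
print(summary)  # {'nucleotides': 1605, 'codons': 535, 'amino_acids': 534}
```

Under these assumptions the 1605 nucleotides of the frame correspond to 535 codons, i.e. 534 amino acids plus one stop codon, which agrees with the protein length given above.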

The activity of the plant β1,2-xylosyltransferase is detected and
measured by a method in which the xylosyltransferase is added to
a sample containing UDP-xylose and a labelled acceptor (e.g. a
glycopeptide or labelled oligosaccharide). After the reaction
time, the content of bound xylose is measured. The activity of
the xylosyltransferase in this case is seen as positive if the
activity measurement is higher by at least 10 to 20%, in
particular at least 30 to 50%, than the activity measurement of
the negative control. The structure of the oligosaccharide may
additionally be verified by means of HPLC. Such protocols are
prior art (Staudacher et al., 1998, Anal. Biochem. 246, 96-101;
Staudacher et al., 1991, Eur. J. Biochem. 199, 745-751). Whether
or not the xylose is bound to the acceptor substrate can
furthermore be determined by measuring the mass of the product by
means of mass spectrometry.
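
The positivity criterion described above can be written as a simple comparison. The Python sketch below is a hedged illustration only: the function name `is_positive`, the signal units and the default threshold of 10% are assumptions, chosen from the lower end of the range stated in the text.

```python
def is_positive(sample_signal: float, control_signal: float,
                min_increase: float = 0.10) -> bool:
    """Return True if the sample exceeds the negative control by at
    least `min_increase`, expressed as a fraction of the control.

    Assumption: both signals are measured in the same units (for
    example counts of labelled xylose bound to the acceptor).
    """
    if control_signal <= 0:
        raise ValueError("negative control signal must be positive")
    relative_increase = (sample_signal - control_signal) / control_signal
    return relative_increase >= min_increase


# Hypothetical readings: 1300 units for the sample, 1000 for the control.
print(is_positive(1300.0, 1000.0))                     # lenient 10% criterion
print(is_positive(1300.0, 1000.0, min_increase=0.50))  # stricter 50% criterion
```

A stricter threshold (e.g. `min_increase=0.30` or `0.50`) corresponds to the "in particular at least 30 to 50%" criterion.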

The pairing of two DNA molecules can be changed by selection of
the temperature and ionic strength of the sample. According to
the invention, stringent conditions are understood to be
conditions which allow for an exact, stringent binding. For
instance, the DNA molecules are hybridized in 7% sodium dodecyl
sulfate (SDS), 0.5 M NaPO4, pH 7.0, 1 mM EDTA at 50°C, and washed
with 1% SDS at 42°C.

Whether sequences have an at least 50% homology to SEQ ID NO: 8 
can be determined e.g. by means of the program FastDB of EMBL or 
SWISSPROT data bank. 

There exist a number of relevant differences between the
recombinant β1,2-xylosyltransferase encoded by the DNA molecule
of the present invention and the respective enzyme from soybean
as described in WO 99/29835 A1:

1. The recombinant enzyme is soluble without detergents (e.g. 
Triton X-100), whereas the solubility of the enzyme from soybean 
depends on the presence of detergents. 

2. The recombinant enzyme is fully active in the absence of
detergents (e.g. Triton X-100).

3. The recombinant enzyme is N-glycosylated, whereas the enzyme 
from soybean is described to be unglycosylated. 

4. The enzyme from A. thaliana exhibits full enzymatic activity 
also as a truncated form lacking the 32 N-terminal amino acids. 

5. In contrast to the enzyme from soybean the enzyme from A. 
thaliana has a broad pH-optimum and shows pronounced activity in 
the range of pH 6 - 8. 

6. The cDNA sequence coding for the soybean enzyme corresponds 
only to amino acids (aa) 199 - 469 of the A. thaliana protein, 
see figure 11. 

7. The cDNA sequence coding for A. thaliana xylosyltransferase
contains two insertions (corresponding to aa 375-382 and aa
425-429 of the predicted protein sequence) compared to the partial
sequence of the soybean enzyme, see figure 11.

8. None of the five peptides (see figure 4 of WO 99/29835 A1)
isolated from the soybean enzyme is identical to the
corresponding regions of the enzyme from A. thaliana:

Peptide SEQ ID NO. 1: homologous to aa 411-422 of the A. thaliana enzyme
Peptide SEQ ID NO. 2: homologous to aa 192-205 of the A. thaliana enzyme
Peptide SEQ ID NO. 3: homologous to aa 451-477 of the A. thaliana enzyme
Peptide SEQ ID NO. 4: homologous to aa 191-205 of the A. thaliana enzyme
Peptide SEQ ID NO. 5: homologous to aa 503-512 of the A. thaliana enzyme
(remark: the cDNA sequence listed in WO 99/29835 A1 does not
contain a coding sequence for peptide 5).

Therefore the DNA molecule according to the present invention is
particularly advantageous since it encodes an active recombinant
enzyme which shows surprisingly advantageous characteristics and
effects over the known purified enzyme.

Preferably, the sequence of the DNA molecule of the invention
encodes a protein with a β1,2-xylosyltransferase activity. This
specific protein is especially useful for analysis, experiments
and production methods which relate to the
β1,2-xylosyltransferase.

Preferably, the DNA molecule according to the invention is at
least 70%, preferably at least 80%, particularly preferred at
least 95%, homologous with the sequence according to SEQ ID NO:
8. This sequence codes for a particularly active
β1,2-xylosyltransferase. The homology preferably is determined
with a program which recognizes insertions and deletions and
which does not consider these in the homology calculation.
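
A homology calculation that recognizes insertions and deletions but leaves them out of the result can be illustrated as follows. This Python sketch is not the program referred to in the text; it assumes that "homology" is read as percent identity, that the two sequences have already been aligned by some other tool, and that gaps are written as `-` (the function name `percent_identity` is hypothetical):

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Alignment columns in which either sequence carries a gap (an
    insertion or deletion) are skipped and do not count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # insertion/deletion column: ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy alignment: one gap column is ignored, 5 of the remaining 6 match.
print(round(percent_identity("ATG-CCA", "ATGTCGA"), 1))  # 83.3
```

A sequence would then satisfy the "at least 70%" criterion if the returned value is 70.0 or greater.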

According to a further advantageous embodiment, the DNA molecule
comprises 1750 to 1850, in particular 1831, base pairs.

In doing so, it is particularly advantageous if one of the above- 
indicated DNA molecules is covalently associated with a detect- 
able marker substance. As the marker (labelling) substance, any 
common marker can be used, such as, e.g., fluorescent, lumines- 
cent, radioactive markers, biotin, etc. In this manner, reagents 
are provided which are suitable for the detection, selection and 
quantitation of corresponding DNA molecules in solid tissue sam- 
ples (e.g. from plants) or also in liquid samples, by means of 
hybridizing methods. 

Preferably, the DNA molecule according to the invention includes
a sequence which comprises a deletion, insertion and/or
substitution mutation. The number of mutant nucleotides is
variable and ranges from a single one to several deleted,
inserted or substituted nucleotides. It is also possible that the
reading frame is shifted by the mutation. In such a "knock-out
gene" it is merely important that the expression of a
β1,2-xylosyltransferase is disturbed, and the formation of an
active, functional enzyme is prevented. In doing so, the site of
the mutation is variable, as long as expression of an
enzymatically active protein is prevented. Preferably, the
mutation is located in the catalytic region of the enzyme, which
lies in the C-terminal region. Methods of inserting mutations
into DNA sequences are well known to the skilled artisan, and
therefore the various possibilities of mutagenesis need not be
discussed here in detail. Random mutagenesis as well as, in
particular, directed mutagenesis, e.g. site-directed mutagenesis,
oligonucleotide-controlled mutagenesis or mutagenesis by aid of
restriction enzymes, may be employed in this instance.
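
Whether a given insertion or deletion shifts the reading frame follows from simple modular arithmetic: the frame is shifted whenever the net change in length is not a multiple of three. A minimal Python sketch of this general rule (the function name `shifts_reading_frame` is hypothetical):

```python
def shifts_reading_frame(inserted_nt: int = 0, deleted_nt: int = 0) -> bool:
    """Return True if the net length change is not a multiple of three,
    i.e. if codons downstream of the mutation are read in a new frame."""
    if inserted_nt < 0 or deleted_nt < 0:
        raise ValueError("nucleotide counts must be non-negative")
    return (inserted_nt - deleted_nt) % 3 != 0


print(shifts_reading_frame(deleted_nt=1))   # single-base deletion
print(shifts_reading_frame(inserted_nt=3))  # in-frame insertion of one codon
```

An in-frame mutation can of course still inactivate the enzyme, for example by removing residues of the catalytic region; the sketch addresses only the frame shift itself.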

The invention further provides a DNA molecule which codes for a
ribozyme which comprises two sequence sections, each of which has
a length of at least 10 to 15 base pairs, which are complementary
to sequence sections of an inventive DNA molecule as described
above so that the ribozyme complexes and cleaves the mRNA which
is transcribed by a natural β1,2-xylosyltransferase DNA molecule.
The publication by John M. Burke "Clearing the way for ribozymes"
(Nature Biotechnology 15:414-415; 1997) relates to the general
mode of function of ribozymes. The ribozyme will recognize the
mRNA of the β1,2-xylosyltransferase by complementary base pairing
with the mRNA. Subsequently, the ribozyme will cleave and destroy
the RNA in a sequence-specific manner, before the enzyme is
translated. After dissociation from the cleaved substrate, the
ribozyme will repeatedly hybridize with RNA molecules and act as
a specific endonuclease. In general, ribozymes may specifically
be produced for inactivation of a certain mRNA, even if the
entire DNA sequence which codes for the protein is not known.
Ribozymes are particularly efficient if the ribosomes move slowly
along the mRNA. In that case it is easier for the ribozyme to
find a ribosome-free site on the mRNA. For this reason, slow
ribosome mutants are also suitable as a system for ribozymes (J.
Burke, 1997, Nature Biotechnology 15, 414-415).
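
The two complementary sequence sections of such a ribozyme can be derived from the regions of the target mRNA that flank the intended cleavage position. The Python sketch below is only an illustration under simplifying assumptions: it computes the two binding arms by reverse complementation, omits the catalytic core entirely and ignores any nucleotide that remains unpaired at the cleavage site. The function names, the toy sequence and the default arm length of 12 nucleotides are hypothetical.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return seq.upper().translate(_RNA_COMPLEMENT)[::-1]


def binding_arms(mrna: str, cut_after: int, arm_length: int = 12):
    """Return (five_prime_arm, three_prime_arm) of a ribozyme, 5'->3'.

    `cut_after` is the number of mRNA nucleotides lying 5' of the
    intended cleavage position. Because base pairing is antiparallel,
    the ribozyme's 5' arm pairs with the mRNA region downstream of the
    cut and its 3' arm pairs with the region upstream of the cut.
    """
    if cut_after - arm_length < 0 or cut_after + arm_length > len(mrna):
        raise ValueError("flanking regions are shorter than arm_length")
    upstream = mrna[cut_after - arm_length:cut_after]
    downstream = mrna[cut_after:cut_after + arm_length]
    return reverse_complement_rna(downstream), reverse_complement_rna(upstream)


# Toy 16-nt target with 4-nt arms, for readability only.
print(binding_arms("ACGUACGGAUCCUUAG", cut_after=8, arm_length=4))
# ('GGAU', 'CCGU')
```

For the ribozyme described above, `arm_length` would be chosen as at least 10 to 15 nucleotides.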

Another possibility is to use a modified form of a ribozyme, i.e.
a minizyme. Minizymes are efficient particularly for cleaving
larger mRNA molecules. A minizyme is a hammerhead ribozyme which
has a short oligonucleotide linker instead of the stem/loop II.
Dimer-minizymes are particularly efficient (Kuwabara et al.,
1998, Nature Biotechnology 16, 961-965).

A further aspect of the invention relates to a biologically
functional vector which comprises one of the above-indicated DNA
molecules. For transfection into host cells, an independent
vector capable of amplification is necessary, wherein, depending
on the host cell, transfection mechanism, task and size of the
DNA molecule, a suitable vector can be used. Since a large number
of different vectors is known, an enumeration thereof would go
beyond the limits of the present application and is therefore
dispensed with here, particularly since the vectors are very well
known to the skilled artisan (as regards the vectors as well as
all the techniques and terms used in this specification which are
known to the skilled artisan, cf. also Maniatis). Ideally, the
vector has a small molecular mass and should comprise selectable
genes so as to lead to an easily recognizable phenotype in a cell
and thus enable an easy selection of vector-containing and
vector-free host cells. To obtain a high yield of DNA and
corresponding gene products, the vector should comprise a strong
promoter, as well as an enhancer, gene amplification signals and
regulator sequences. For an autonomous replication of the vector,
furthermore, a replication origin is important. Polyadenylation
sites are responsible for correct processing of the mRNA and
splice signals for the RNA transcripts. If phages, viruses or
virus particles are used as the vectors, packaging signals will
control the packaging of the vector DNA. For instance, for
transcription in plants, Ti plasmids are suitable, for
transcription in insect cells, baculoviruses, and in insects,
transposons such as the P element.

If the above-described inventive vector is inserted into a plant
or into a plant cell, a post-transcriptional suppression of the
gene expression of the endogenous β1,2-xylosyltransferase gene is
attained by transcription of a transgene homologous thereto or of
parts thereof, in sense orientation. For this sense technique,
furthermore, reference is made to the publications by Baucombe
1996, Plant. Mol. Biol., 9:373-382, and Brigneti et al., 1998,
EMBO J. 17:6739-6746. This strategy of "gene silencing" is an
effective way of suppressing the expression of the
β1,2-xylosyltransferase gene, cf. also Waterhouse et al., 1998,
Proc. Natl. Acad. Sci. USA, 95:13959-13964.

Furthermore, the invention relates to a biologically functional
vector comprising a DNA molecule according to one of the
above-described embodiments, being inversely orientated with
respect to the promoter. If this vector is transfected into a
host cell, an "antisense mRNA" will be read which is
complementary to the mRNA of the β1,2-xylosyltransferase and
complexes the latter. This bond will either hinder correct
processing, transport or stability of the mRNA or, by preventing
ribosome annealing, it will hinder translation and thus the
normal gene expression of the β1,2-xylosyltransferase.

Although the entire sequence of the DNA molecule could be
inserted into the vector, partial sequences thereof may, because
of their smaller size, be advantageous for certain purposes. With
the antisense aspect, e.g., it is important that the DNA molecule
is large enough to form a sufficiently large antisense mRNA which
will bind to the transferase mRNA. A suitable antisense RNA
molecule comprises, e.g., from 50 to 200 nucleotides since many
of the known, naturally occurring antisense RNA molecules
comprise approximately 100 nucleotides.
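
How an antisense RNA of the stated size relates to a section of the coding sequence can be sketched in a few lines of Python. This is an illustration under assumptions, not the procedure of the invention: the input is taken to be the sense strand of a cDNA written 5'->3' in the letters A, C, G and T, and the function name `antisense_rna`, the toy sequence and the default length of 100 nucleotides are hypothetical.

```python
_DNA_TO_ANTISENSE_RNA = str.maketrans("ACGT", "UGCA")


def antisense_rna(cdna_sense: str, start: int, length: int = 100) -> str:
    """Antisense RNA (5'->3') complementary to a section of the sense cDNA.

    `start` is a 0-based offset into the sense strand; `length` is
    restricted to the 50 to 200 nucleotide range given in the text.
    """
    if not 50 <= length <= 200:
        raise ValueError("length must be between 50 and 200 nucleotides")
    segment = cdna_sense.upper()[start:start + length]
    if len(segment) < length:
        raise ValueError("section extends beyond the end of the sequence")
    if set(segment) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    # Complement each base (as RNA), then reverse to write 5'->3'.
    return segment.translate(_DNA_TO_ANTISENSE_RNA)[::-1]


toy_sense = "ATGC" * 30  # 120-nt placeholder, not a real gene sequence
print(len(antisense_rna(toy_sense, start=0, length=60)))  # 60
```

In a vector, the same effect is obtained by inserting the chosen section in inverse orientation with respect to the promoter, so that the transcript is the reverse complement of the corresponding mRNA section.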

For a particularly effective inhibition of the expression of an
active β1,2-xylosyltransferase, a combination of the sense
technique and the antisense technique is suitable (Waterhouse et
al., 1998, Proc. Natl. Acad. Sci. USA, 95:13959-13964).

Advantageously, rapidly hybridizing RNA molecules are used. The
efficiency of antisense RNA molecules which have a size of more
than 50 nucleotides will depend on the annealing kinetics in
vitro. Thus, e.g., rapidly annealing antisense RNA molecules
exhibit a greater inhibition of protein expression than slowly
hybridizing RNA molecules (Wagner et al., 1994, Annu. Rev.
Microbiol., 48:713-742; Rittner et al., 1993, Nucl. Acids Res.,
21:1381-1387). Such rapidly hybridizing antisense RNA molecules
particularly comprise a large number of external bases (free ends
and connecting sequences), a large number of structural
subdomains (components) as well as a small number of loops
(Patzel et al., 1998, Nature Biotechnology 16, 64-68). The
hypothetical secondary structures of the antisense RNA molecule
may, e.g., be determined by aid of a computer program, according
to which a suitable DNA sequence encoding the antisense RNA is
chosen.

Different sequence regions of the DNA molecule may be inserted
into the vector. One possibility consists, e.g., in inserting
into the vector only that part which is responsible for ribosome
annealing. Blocking in this region of the mRNA will suffice to
stop the entire translation. A particularly high efficiency of
the antisense molecules also results for the 5'- and
3'-nontranslated regions of the gene.

The invention also relates to a biologically functional vector
which comprises one of the two last-mentioned DNA molecules
(mutation or ribozyme DNA molecule). What has been said above
regarding vectors also applies in this instance.

According to the invention, there is provided a method of
preparing a cDNA comprising the DNA molecule of the invention,
wherein RNA is isolated from a plant cell, in particular from
leaf cells, by means of which a reverse transcription is carried
out after the addition of a reverse transcriptase and primers.
The individual steps of this method are carried out according to
protocols known per se. For the reverse transcription, on the one
hand, it is possible to produce the cDNA of the entire mRNA with
the help of oligo(dT) primers, and only then to carry out a PCR
by means of selected primers so as to prepare DNA molecules
comprising the β1,2-xylosyltransferase gene. On the other hand,
the selected primers may directly be used for the reverse
transcription so as to obtain short, specific cDNA. The suitable
primers may be prepared e.g. synthetically according to the
pattern of cDNA sequences of the transferase.

The invention furthermore relates to a method of cloning a
β1,2-xylosyltransferase, characterized in that the DNA molecule
of the invention is cloned into a vector which subsequently is
transfected into a host cell or host, respectively, wherein, by
selection and amplification of transfected host cells, cell lines
are obtained which express the active β1,2-xylosyltransferase.
The DNA molecule is inserted into the vector, e.g., by aid of
restriction endonucleases. For the vector, there applies what has
already been said above. What is important in this method is that
an efficient host-vector system is chosen. To obtain an active
enzyme, eukaryotic host cells are particularly suitable. One
possible way is to transfect the vector into insect cells. In
doing so, in particular an insect virus would have to be used as
vector, such as, e.g., baculovirus.

Of course, plants or plant cells, human or other vertebrate cells 
can also be transfected, in which case the latter would express 
an enzyme foreign to them. 

Preferably, a method of preparing recombinant host cells, in
particular plant cells or plants, respectively, with a suppressed
or completely stopped β1,2-xylosyltransferase production is
provided, which is characterized in that at least one of the
vectors according to the invention, i.e. that one comprising the
inventive DNA molecule, the mutant DNA molecule or the DNA
molecule coding for ribozymes or the one comprising the DNA
molecule in inverse orientation to the promoter, is inserted into
the host cell or plant, respectively. What has been said above
for the transfection also is applicable in this case.

As the host cells, plant cells may, e.g., be used, for which,
e.g., the Ti plasmid with the Agrobacterium system is suitable.
With the Agrobacterium system it is possible to transfect a plant
directly. Agrobacteria cause crown galls in plants. If
agrobacteria infect an injured plant, the bacteria themselves do
not get into the plant cells, but they insert the recombinant DNA
portion, the so-called T-DNA, from the circular,
extrachromosomal, tumour-inducing Ti plasmid into the plant
cells. The T-DNA, and thus also the DNA molecule inserted
therein, are integrated into the chromosomal DNA of the cell in a
stable manner so that the genes of the T-DNA will be expressed in
the plant.

There exist numerous known, efficient transfection mechanisms for
different host systems. Some examples are electroporation, the
calcium phosphate method, microinjection and the liposome method.

Subsequently, the transfected cells are selected, e.g. on the
basis of antibiotic resistances for which the vector comprises
genes, or other marker genes. Then the transfected cell lines are
amplified, either in small amounts, e.g. in Petri dishes, or in
large amounts, e.g. in fermentors. Furthermore, plants have a
particular characteristic, i.e. they are capable of regenerating
from one (transfected) cell or from a protoplast, respectively,
into a complete plant which can be grown.

Depending on the vector used, processes will occur in the host so 
that the enzyme expression will be suppressed or completely 
blocked: 

If the vector comprising the DNA molecule with the deletion,
insertion or substitution mutation is transfected, a homologous
recombination will occur: the mutant DNA molecule will recognize
the identical sequence in the genome of the host cell despite its
mutation and will be inserted at exactly that place so that a
"knock-out gene" is formed. In this manner, a mutation is
introduced into the gene for the β1,2-xylosyltransferase which is
capable of inhibiting the faultless expression of the
β1,2-xylosyltransferase. As has been explained above, with this
technique it is important that the mutation suffices to block the
expression of the active protein. After selection and
amplification, the gene may be sequenced as an additional check
so as to determine the success of the homologous recombination or
the degree of mutation, respectively.

WO 01/64901
PCT/EP01/02352

If the vector comprising the DNA molecule coding for a ribozyme
is transfected, the active ribozyme will be expressed in the host
cell. The ribozyme complexes the complementary mRNA sequence of
the β1,2-xylosyltransferase at least at a certain site, cleaves
this site, and in this manner it can inhibit the translation of
the enzyme. In this host cell as well as in cell lines, or
optionally plants, derived therefrom, β1,2-xylosyltransferase
will not be expressed.

In case the vector comprises the inventive DNA molecule in sense
or inverse direction to the promoter, a sense or antisense mRNA
will be expressed in the transfected cell (or plant,
respectively). The antisense mRNA is complementary at least to a
part of the mRNA sequence of the β1,2-xylosyltransferase and may
likewise inhibit translation of the enzyme. As an example of a
method of suppressing the expression of a gene by the antisense
technique, reference is made to the publication by Smith et al.,
1990, Mol. Gen. Genet. 224:477-481, in which the expression of a
gene involved in the maturing process of tomatoes is inhibited.
Double-stranded RNA (dsRNA) has recently been shown to trigger
sequence-specific gene silencing in a wide variety of organisms,
including nematodes, plants, trypanosomes, fruit flies and
planaria; an as yet uncharacterized RNA trigger has been shown to
induce DNA methylation in several different plant systems,
leading to selective interference with gene function (for a
review see Fire A., 1999, Trends Genet. 15(9): 358-363).
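
By way of illustration only, the antisense principle referred to
above can be sketched in a few lines of Python: an antisense
transcript is the reverse complement of the targeted mRNA segment.
The sequence used is a made-up placeholder and not part of SEQ ID
NO: 8, and the function names are hypothetical.

```python
# Minimal sketch: derive the antisense RNA for a stretch of mRNA.
# The sequence below is a hypothetical placeholder, not SEQ ID NO: 8.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the reverse complement (written 5'->3') of an mRNA segment."""
    rna = mrna.upper().replace("T", "U")  # tolerate DNA-style input
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def pairs_with(mrna: str, candidate: str) -> bool:
    """True if `candidate` is fully complementary to `mrna`."""
    return antisense_rna(mrna) == candidate.upper().replace("T", "U")


if __name__ == "__main__":
    target = "AUGAGUAAACGG"  # placeholder segment around a start codon
    print(antisense_rna(target))
```

Applying the function twice returns the original segment, which is
a quick check that the complementarity is symmetric.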

In all the systems, expression of the β1,2-xylosyltransferase is
at least suppressed, preferably even completely blocked. The
degree of the disturbance of the gene expression will depend on
the degree of complexing, on homologous recombination, on possible
subsequent coincidental mutations and on other processes in the
region of the genome. The transfected cells are checked for
β1,2-xylosyltransferase activity and selected.

Moreover, it is possible to increase the above-described
suppression of the expression of the β1,2-xylosyltransferase
still further by introducing into the host, in addition to an
above-described vector, a vector comprising a gene coding for a
mammalian protein, e.g. β1,4-galactosyltransferase. Xylosylation
may be reduced by the action of other mammalian enzymes, the
combination of the inhibition of the expression of an active
β1,2-xylosyltransferase by means of the inventive vector and by
means of a mammalian enzyme vector being particularly efficient.

Any type of plant may be used for transfection, e.g. mung bean,
tobacco plant, tomato and/or potato plant.

Another advantageous method of producing recombinant host cells,
in particular plant cells, or plants, respectively, consists in
that the DNA molecule comprising the mutation is inserted into
the genome of the host cell, or plant, respectively, in the place
of the non-mutated homologous sequence (Schaefer et al., 1997,
Plant J. 11(6):1195-1206). This method thus does not function
with a vector, but with a pure DNA molecule. The DNA molecule is
inserted into the host e.g. by gene bombardment, microinjection
or electroporation, to mention just three examples. As has
already been explained, the DNA molecule binds to the homologous
sequence in the genome of the host so that homologous
recombination, and thus reception of the deletion, insertion or
substitution mutation, respectively, will result in the genome.
Expression of the β1,2-xylosyltransferase can thereby be
suppressed or completely blocked, respectively.

Preferably, recombinant plants or plant cells, respectively, are
provided which have been prepared by one of the methods described
above, their β1,2-xylosyltransferase production being suppressed
or completely blocked, respectively. Preferably, their
β1,2-xylosyltransferase activity is less than 50%, in particular
less than 20%, particularly preferred 0%, of the
β1,2-xylosyltransferase activity occurring in natural plants or
plant cells, respectively. The advantage of these plants or plant
cells, respectively, is that the glycoproteins produced by them
comprise no or hardly any β1,2-bound xylose. If products of these
plants are taken up by human or vertebrate bodies, there will be
no immune reaction due to the β1,2-xylose epitope.

The invention also relates to a PNA molecule comprising a base 
sequence complementary to the sequence of the DNA molecule accor- 
ding to the invention. PNA (peptide nucleic acid) is a DNA-like 
sequence, the nucleobases being bound to a pseudo-peptide back- 
bone. PNA generally hybridizes with complementary DNA-, RNA- or 
PNA-oligomers by Watson-Crick base pairing and helix formation. 
The peptide backbone ensures a greater resistance to enzymatic 
degradation. The PNA molecule thus is an improved antisense
agent.

Neither nucleases nor proteases are capable of attacking a PNA 
molecule. The stability of the PNA molecule, if bound to a
complementary sequence, provides sufficient steric blocking of
DNA and RNA polymerases, reverse transcriptase, telomerase and
ribosomes. The publication by Pooga et al., "Cell penetrating PNA 
constructs regulate galanin receptor levels and modify pain 
transmission in vivo" (Nature Biotechnology 16:857-861; 1998) re- 
lates to PNA molecules in general and specifically to a PNA mole- 
cule that is complementary to human galanin receptor type 1 mRNA. 

If the PNA molecule comprises the above-mentioned sequence, it
will bind to the DNA or to a site of the DNA, respectively, which
codes for β1,2-xylosyltransferase and in this way is capable of
inhibiting transcription of this enzyme. As it is neither
transcribed nor translated, the PNA molecule will be prepared
synthetically, e.g. by aid of the t-Boc technique.

Advantageously, a PNA molecule is provided which comprises a base
sequence which corresponds to the sequence of the inventive DNA
molecule. This PNA molecule will complex the mRNA or a site of
the mRNA of β1,2-xylosyltransferase so that the translation of
the enzyme will be inhibited. Similar arguments as set forth for
the antisense RNA apply in this case. Thus, e.g., a particularly
efficient complexing region is the translation start region or
also the 5'-non-translated regions of mRNA.

A further aspect of the present invention relates to a method of
preparing plants or plant cells, respectively, in particular
plant cells, which comprise a blocked expression of the
β1,2-xylosyltransferase at the transcription or translation
level, respectively, which is characterized in that inventive PNA
molecules are inserted in the cells. To insert the PNA molecule
or the PNA molecules, respectively, in the cell, again
conventional methods, such as, e.g., electroporation or
microinjection, are used. Insertion is particularly efficient if
the PNA oligomers are bound to cell penetration peptides, e.g.
transportan or pAntp (Pooga et al., 1998, Nature Biotechnology
16:857-861).

The invention provides a method of preparing recombinant
glycoproteins which is characterized in that the inventive,
recombinant plants or plant cells, respectively, whose
β1,2-xylosyltransferase production is suppressed or completely
blocked, respectively, or plants or cells, respectively, in which
the PNA molecules have been inserted according to the method of
the invention, are transfected with the gene that expresses the
glycoprotein so that the recombinant glycoproteins are expressed.
In doing so, as has already been described above, vectors
comprising genes for the desired proteins are transfected into
the host or host cells, respectively. The transfected plant cells
will express the desired proteins, and these have no or hardly
any β1,2-bound xylose. Thus, they do not trigger the immune
reactions already mentioned above in the human or vertebrate
body. Any proteins may be produced in these systems.

Advantageously, a method of preparing recombinant human
glycoproteins is provided which is characterized in that the
recombinant plants or plant cells, respectively, whose
β1,2-xylosyltransferase production is suppressed or completely
blocked, or plants or cells, respectively, in which PNA molecules
have been inserted according to the method of the invention, are
transfected with the gene that expresses the glycoprotein so that
the recombinant glycoproteins are expressed. By this method it
becomes possible to produce human proteins in plants (plant
cells) which, if taken up by the human body, do not trigger any
immune reaction directed against β1,2-bound xylose residues.
Here, it is possible to utilize plant types which serve as
foodstuffs, e.g. banana, potato and/or tomato, for producing the
recombinant glycoproteins. The tissues of such a plant comprise
the recombinant glycoprotein so that, e.g. by extraction of the
recombinant glycoprotein from the tissue and subsequent
administration, or directly by eating the plant tissue,
respectively, the recombinant glycoprotein is taken up in the
human body.

Preferably, a method of preparing recombinant human
glycoproteins for medical use is provided, wherein the inventive,
recombinant plants or plant cells, respectively, whose
β1,2-xylosyltransferase production is suppressed or completely
blocked, respectively, or plants or cells, respectively, into
which the PNA molecules have been inserted according to the
method of the invention, are transfected with the gene that
expresses the glycoprotein so that the recombinant glycoproteins
are expressed. In doing so, any protein can be used which is of
medical interest.

Moreover, the present invention relates to recombinant
glycoproteins according to a method described above, wherein they
have been prepared in plant systems and wherein their peptide
sequence comprises less than 50%, in particular less than 20%,
particularly preferred 0%, of the β1,2-bound xylose residues
occurring in proteins expressed in non-xylosyltransferase-reduced
plant systems. Naturally, glycoproteins which do not comprise
β1,2-bound xylose residues are to be preferred. The amount of
β1,2-bound xylose will depend on the degree of the above-described
suppression of the β1,2-xylosyltransferase.

Preferably, the invention relates to recombinant human
glycoproteins which have been produced in plant systems according
to a method described above and whose peptide sequence comprises
less than 50%, in particular less than 20%, particularly
preferred 0%, of the β1,2-bound xylose residues occurring in the
proteins expressed in non-xylosyltransferase-reduced plant
systems.

A particularly preferred embodiment relates to recombinant human
glycoproteins for medical use which have been prepared in plant
systems according to a method described above and whose peptide
sequence comprises less than 50%, in particular less than 20%,
particularly preferred 0%, of the β1,2-bound xylose residues
occurring in the proteins expressed in
non-xylosyltransferase-reduced plant systems.

The glycoproteins according to the invention may include other
bound oligosaccharide units specific for plants, whereby, in the
case of human glycoproteins, they differ from these natural
glycoproteins. Nevertheless, the glycoproteins according to the
invention trigger a slighter immune reaction or no immune
reaction at all, respectively, in the human body, since, as has
already been explained in the introductory portion of the
specification, the β1,2-bound xylose residues, together with
α1,3-fucose residues, are the main cause for the immune reactions
or cross immune reaction, respectively, to plant glycoproteins.

A further aspect comprises a pharmaceutical composition
comprising the glycoproteins according to the invention. In
addition to the glycoproteins of the invention, the
pharmaceutical composition comprises further additions common for
such compositions. These are, e.g., suitable diluting agents of
various buffer contents (e.g. Tris-HCl, acetate, phosphate), pH
and ionic strength, additives, such as tensides and solubilizers
(e.g. Tween 80, Polysorbate 80), preservatives (e.g. Thimerosal,
benzyl alcohol), adjuvants, antioxidants (e.g. ascorbic acid,
sodium metabisulfite), emulsifiers, fillers (e.g. lactose,
mannitol), covalent bonding of polymers, such as polyethylene
glycol, to the protein, incorporation of the material in
particulate compositions of polymeric compounds, such as
polylactic acid, polyglycolic acid, etc., or in liposomes, and
auxiliary agents and/or carrier substances which are suitable in
the respective treatment. Such compositions will influence the
physical condition, stability, rate of in vivo liberation and
rate of in vivo excretion of the glycoproteins of the invention.

The invention also provides a method of selecting DNA molecules
which code for a β1,2-xylosyltransferase in a sample, wherein
the labelled DNA molecules of the invention, or partial sequences
thereof, are added to the sample and bind to the DNA molecules
that code for a β1,2-xylosyltransferase. The hybridized DNA
molecules can be detected, quantitated and selected. For the
sample to contain single-stranded DNA with which the labelled DNA
molecules can hybridize, the sample is denatured, e.g. by heating.

One possible way is to separate the DNA to be assayed, possibly
after the addition of endonucleases, by gel electrophoresis on an
agarose gel. After the DNA has been transferred to a
nitrocellulose membrane, the labelled DNA molecules according to
the invention are admixed and hybridize to the corresponding
homologous DNA molecule ("Southern blotting").

Another possible way consists in finding homologous genes from
other species by PCR-dependent methods using specific and/or
degenerate primers derived from the sequence of the DNA molecule
according to the invention.
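
As a purely illustrative sketch of what a degenerate primer
represents, the following Python snippet expands a primer written
with IUPAC ambiguity codes into the individual sequences it stands
for. The example primers are invented and are not primers 1 to 4
of the present application; the function names are hypothetical.

```python
# Minimal sketch: expand a degenerate primer (IUPAC ambiguity codes)
# into every concrete sequence it represents. Example primers are invented.
from itertools import product

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """List all concrete sequences encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


def degeneracy(primer: str) -> int:
    """Number of distinct sequences in the primer mixture."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


if __name__ == "__main__":
    example = "AAYGGN"  # invented primer: Y = C/T, N = any base
    print(degeneracy(example), expand_degenerate(example))
```

The degeneracy grows multiplicatively with every ambiguous
position, which is why such primers are usually kept short in
their ambiguous stretches.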

Advantageously, the labelled DNA molecules of the invention or
partial sequences thereof are immobilized onto carrier matrices.
The use of DNA microarrays ("gene chips") is a further possible
way to find homologous genes or to study the expression level of
homologous genes. To this end, DNA representing either the entire
genomic gene sequence, the full-length cDNA sequence, parts of
these sequences or any combination of partial sequences is
immobilized onto carrier matrices, in order that homologous genes,
after adding the sample to the carrier matrices, hybridize with
the labelled DNA molecules (for examples see e.g. Ferea T.L. &
Brown P.O., 1999, Current Opinion in Genetics & Development 9:
715-722 and references cited therein).

Preferably, the sample for the above-identified inventive method
comprises genomic DNA of a plant organism. By this method, a
large number of plants or other species can be assayed in a very
rapid and efficient manner for the presence of the
β1,2-xylosyltransferase gene. In this manner, it is possible to
select plants or individuals of other species which do not
comprise this gene, or to suppress or completely block,
respectively, the expression of the β1,2-xylosyltransferase in
such plants or other organisms which comprise this gene, by an
above-described method of the invention, so that subsequently
they may be used for the transfection and production of (human)
glycoproteins.

The invention also relates to DNA molecules which code for a
β1,2-xylosyltransferase, which have been selected according to
the three last-mentioned methods and subsequently have been
isolated from the sample. These molecules can be used for further
assays. They can be sequenced and in turn can be used as DNA
probes for finding β1,2-xylosyltransferases. For organisms which
are related to the organisms from which they have been isolated,
these labelled DNA molecules will function more efficiently as
probes than the DNA molecules of the invention.

A further aspect of the invention relates to a preparation of
β1,2-xylosyltransferase cloned according to the invention which
comprises isoforms having pI values of between 6.0 and 9.0, in
particular between 7.50 and 8.00. The pI value of a protein is
that pH value at which its net charge is zero; it is dependent on
the amino acid sequence, the glycosylation pattern as well as on
the spatial structure of the protein. The β1,2-xylosyltransferase
may comprise several isoforms which have a pI value in this
range. The reasons for the various isoforms of the transferase
are, e.g., different glycosylations as well as limited
proteolysis. The pI value of a protein can be determined by
isoelectric focussing, which is known to the skilled artisan.
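
The definition of the pI value as the pH of zero net charge can be
illustrated by a short Python sketch that estimates a theoretical
pI from the amino acid sequence alone. The pKa values are one
simplified, textbook-style set chosen here as an assumption; the
sketch ignores glycosylation and spatial structure, so it is not a
substitute for isoelectric focussing, and the test peptides are
invented.

```python
# Minimal sketch: theoretical pI as the pH at which the net charge is zero.
# pKa values are an assumed, simplified set; glycosylation and folding
# effects mentioned in the text are deliberately ignored.

PKA_POSITIVE = {"N_term": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"C_term": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    seq = sequence.upper()
    # Terminal groups: one protonatable amine, one deprotonatable carboxyl.
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE["N_term"]))
    charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE["C_term"] - ph))
    for residue, pka in PKA_POSITIVE.items():
        if residue != "N_term":
            charge += seq.count(residue) / (1.0 + 10 ** (ph - pka))
    for residue, pka in PKA_NEGATIVE.items():
        if residue != "C_term":
            charge -= seq.count(residue) / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """Bisection on pH 0-14; net charge falls monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    print(round(isoelectric_point("ACDKRH"), 2))  # invented test peptide
```

Because the net charge decreases monotonically with pH, a simple
bisection is sufficient to locate the zero crossing.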

The main isoform of the enzyme has an apparent molecular weight
of 60.2 kDa.

In particular, the preparation of the invention comprises an
isoform having a pI value of 7.52.

The invention also relates to a method of preparing "plantified"
carbohydrate units of human and other vertebrate glycoproteins or
other glycoconjugates, wherein UDP-xylose as well as
β1,2-xylosyltransferase encoded by an above-described DNA
molecule are added to a sample that comprises a carbohydrate unit
or a glycoprotein, respectively, so that xylose is bound in
β1,2-position by the β1,2-xylosyltransferase to the carbohydrate
unit or to the glycoprotein, respectively. By the method
according to the invention for cloning β1,2-xylosyltransferase it
is possible to produce large amounts of purified enzyme. To
obtain a fully active transferase, suitable reaction conditions
are provided.

The invention will be explained in more detail by way of the
following examples and drawing figures to which, of course, it
shall not be restricted. In detail, in the drawings,

Fig. 1 shows the amino acid sequences of soybean peptides 2 and 3
(patent WO 99/29835; SEQ ID NO: 3 and 5), the homology between
these peptides and an A. thaliana sequence as well as the DNA
sequences of the four primers 1-4;

Fig. 2 shows the cDNA sequence of β1,2-xylosyltransferase
including 226 nt of the 5'-untranslated region;

Fig. 3 shows the amino acid sequence of β1,2-xylosyltransferase
derived therefrom;

Figs. 4a, 4b and 4c show the alignment of A. thaliana
β1,2-xylosyltransferase cDNAs, one genomic DNA and one EST
sequence;

Fig. 5 shows the alignment of amino acid sequences of
β1,2-xylosyltransferase derived from the cDNAs, from a genomic
DNA and from an EST sequence;

Fig. 6 is a schematic representation of the
β1,2-xylosyltransferase as well as the PCR products and the
hydrophobicity of the amino acid residues;

Fig. 7 shows a comparison of the β1,2-xylosyltransferase activity
of insect cells transfected with the β1,2-xylosyltransferase gene
with that of a negative control;

Figs. 8a and 8b show the structure of the acceptor substrate and
product of the β1,2-xylosyltransferase;

Fig. 9 shows mass spectra; 



Fig. 10 shows the analysis of the β1,2-xylosyltransferase product
by reversed-phase HPLC, and

Fig. 11 shows the alignment of the predicted amino acid sequence
derived from the cDNA of the present application with the amino
acid sequence of the β1,2-xylosyltransferase purified from
soybean.

Example 1: 
RT-PCR and cDNA cloning 

Primers for the amplification of the putative
β1,2-xylosyltransferase cDNA by RT-PCR were designed as follows: A
BLASTP search of the DDBJ database using two soybean peptides
(SEQ ID NO: 1 and 2; corresponding to SEQ ID NO: 3 and 5 in
Fig. 4 of patent WO 99/29835 A1; however, the C-terminal amino
acids LG were omitted from SEQ ID NO: 5) (see Fig. 1) showed one
protein sequence derived from an Arabidopsis thaliana genomic DNA
sequence (Acc. No. AB015479) with more than 80 % homology (SEQ ID
NO: 3). Primers 3 (SEQ ID NO: 4) and 4 (SEQ ID NO: 5) were based
on the A. thaliana sequence homologous to the soybean peptides 2
and 3. Analysis of the homologous genomic DNA sequence using
Gene-Finder at the BCM Search Launcher resulted in one predicted
protein. Primer 1 (SEQ ID NO: 6) was designed to include the
start codon of the predicted protein, whereas primer 2 (SEQ ID
NO: 7) contains the stop codon of the predicted protein.

Total RNA was isolated from young leaves of Arabidopsis thaliana
var. Columbia using the TRIzol reagent (Life Technologies). The
RNA was treated with DNase (Promega, RQ1 RNase-Free DNase) to
remove traces of genomic DNA. First-strand cDNA was synthesised
from 1 µg of total RNA at 42°C using oligo(dT) primers (Sigma)
and AMV reverse transcriptase (Promega).

The first-strand cDNA was subjected to PCR in which different
combinations of sense and antisense primers were used
(illustrated in Fig. 6): the product of primer 3 and primer 4 was
a DNA fragment with a length of 174 bp (P1), the product of primer
1 and primer 2 was a 1605 bp DNA fragment (P2), the product of
primer 1 and primer 4 was a DNA fragment with a length of 1525 bp
(P3), and primer 3 and primer 2 produced a DNA fragment of 254 bp
(P4). For amplification of the putative open reading frame,
primer 1 and primer 2 were used. A PCR reaction contained, in a
total volume of 50 µl, 0.2 pmol of each primer, 0.05 mM dNTPs,
2 mM MgSO4, 20 mM Tris-HCl (pH 8.2 at 25°C), 10 mM KCl, 10 mM
(NH4)2SO4, 0.1 % Triton X-100, 5 µg nuclease-free BSA and 2.5
units Pfu DNA polymerase from Promega. After a first denaturing
step at 94°C for 2 min, 30 cycles of 1 min at 92°C, 40 sec at
53°C and 3 min 30 sec at 72°C were performed. The last extension
step was carried out at 72°C for 8 min. PCR products were
subcloned into SmaI-linearised and dephosphorylated pUC19 vector
and sequenced. The sequences of the subcloned fragments were
obtained by means of the dideoxynucleotide method (ABI PRISM Dye
Terminator Cycle Sequencing Ready Reaction Kit and ABI PRISM 310
Genetic Analyzer from Perkin Elmer). As a result of the RT-PCR,
three slightly different cDNA sequences were obtained. The
sequence of cDNA 6, which has a size of 1605 bp (xt-Ath6; SEQ ID
NO: 8) and codes for a protein of 534 amino acids having a
molecular weight of 60.2 kDa and a theoretical pI value of 7.52
(see Fig. 3), is identical to the nucleotide sequence derived from
the genomic clone after removal of two introns (xt-Athgen.seq).
cDNA 9 shows 4 base pair changes compared to cDNA 6, whereas cDNA
16 shows 6 base pair changes compared to cDNA 6 (illustrated in
Figs. 4a, 4b and 4c). Accordingly, the amino acid sequence derived
from cDNA 9 comprises two changes compared to the amino acid
sequence derived from cDNA 6 (SEQ ID NO: 8), and the amino acid
sequence of cDNA 16 shows four changed residues (illustrated in
Fig. 5).
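
The fragment sizes and the length of the open reading frame quoted
in this example can be cross-checked with simple arithmetic. The
following Python sketch is only a consistency check under stated
assumptions: it assumes that P2 spans exactly from the start codon
to the stop codon, and it reads P4 as the fragment running from
primer 3 to primer 2. The ORF coordinates 227 and 1831 are those
given for SEQ ID NO: 8 in claim 1; the function names are
hypothetical.

```python
# Minimal sketch: consistency check of the sizes quoted in Example 1.
# Assumption: P2 (primer 1 + primer 2) covers start codon to stop codon.

P1, P2, P3, P4 = 174, 1605, 1525, 254   # fragment lengths in bp
ORF_START, ORF_END = 227, 1831          # ORF within SEQ ID NO: 8 (claim 1)


def orf_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based coordinate range."""
    return end - start + 1


def protein_length(orf_bp: int) -> int:
    """Number of amino acids encoded by an ORF that includes its stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3 - 1  # the stop codon encodes no residue


def fragment_primer3_to_primer2(p1: int, p2: int, p3: int) -> int:
    """Primer 3 to primer 4 (p1) plus the stretch from primer 4 to primer 2."""
    return p1 + (p2 - p3)


if __name__ == "__main__":
    print(orf_length(ORF_START, ORF_END))            # ORF length in bp
    print(protein_length(P2))                        # residues encoded
    print(fragment_primer3_to_primer2(P1, P2, P3))   # compare with P4
```

Under these assumptions the quoted values are mutually consistent:
the ORF coordinates give 1605 bp, which corresponds to 534 codons
plus a stop codon, and 174 bp plus the 80 bp separating the ends of
P3 and P2 gives 254 bp.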

Fig. 3 shows the cDNA-derived amino acid sequence (SEQ ID NO: 9)
of the β1,2-xylosyltransferase (xt-Ath6; SEQ ID NO: 8). Potential
sites for the asparagine-bound glycosylation are at Asn51, Asn301
and Asn479.
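
Potential N-glycosylation sites of this kind are conventionally
recognised by the consensus sequon Asn-X-Ser/Thr, where X is any
residue except proline. The text does not state how the three
sites were identified, so the following Python sketch merely
illustrates that general rule on an invented toy sequence (not SEQ
ID NO: 9); the function name is hypothetical.

```python
# Minimal sketch: locate N-glycosylation sequons (N-X-S/T, X != P).
# The toy sequence is invented and is not SEQ ID NO: 9.
import re

# Lookahead so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of asparagines in N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MNGSAANPTQNLT"  # contains N-G-S, N-P-T (excluded) and N-L-T
    print(find_sequons(toy))
```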

Figs. 4a, 4b and 4c show the alignment of β1,2-xylosyltransferase
nucleotide sequences from the A. thaliana cDNA 6 (xt-Ath6; SEQ ID
NO: 8), the A. thaliana cDNA 9 (xt-Ath9), the A. thaliana cDNA 16
(xt-Ath16), the A. thaliana genomic DNA sequence after removal
of two introns (xt-Athgen), and from an A. thaliana EST sequence
(xt-AthEST). The dotted line stands for the consensus sequence;
the dashed line for a gap.
The genomic sequence (xt-Athgen; Acc. No. AB015479, start codon
at position 58185-58187, stop codon at position 60214-60216 of
the genomic DNA) results from removal of two putative introns
(intron 1: from position 58881 to 59116 of the genomic DNA;
intron 2: from position 59268 to 59458 of the genomic DNA) using
the splice site prediction server NetPlantGene. The A. thaliana
EST sequence (xt-AthEST; Acc. No. AI994524) is the result of a
database search using BLASTN.
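
The genomic coordinates given above can be checked against the
1605 bp of cDNA 6. The Python sketch below derives the exon
intervals from the stated start codon, stop codon and intron
positions and sums their lengths; it assumes that all positions
are 1-based and inclusive, and the function names are
hypothetical.

```python
# Minimal sketch: exon intervals and spliced length from the coordinates
# quoted for Acc. No. AB015479 (assumed 1-based, inclusive).

CDS_START = 58185                 # first base of the start codon
CDS_END = 60216                   # last base of the stop codon
INTRONS = [(58881, 59116), (59268, 59458)]


def exon_intervals(start, end, introns):
    """Return the exon intervals left after removing the introns."""
    exons, position = [], start
    for intron_start, intron_end in sorted(introns):
        exons.append((position, intron_start - 1))
        position = intron_end + 1
    exons.append((position, end))
    return exons


def spliced_length(start, end, introns):
    """Total length of all exons in bp."""
    return sum(e - s + 1 for s, e in exon_intervals(start, end, introns))


if __name__ == "__main__":
    for exon in exon_intervals(CDS_START, CDS_END, INTRONS):
        print(exon)
    print(spliced_length(CDS_START, CDS_END, INTRONS))
```

With these coordinates the three exons add up to 1605 bp, in
agreement with the size stated for cDNA 6.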

Fig. 5 shows the alignment of amino acid sequences from
β1,2-xylosyltransferase derived from A. thaliana cDNA 6 (xt-Ath6;
SEQ ID NO: 9), from A. thaliana cDNA 9 (xt-Ath9), from A. thaliana
cDNA 16 (xt-Ath16), from the A. thaliana genomic sequence
(xt-Athgen), and derived from an A. thaliana EST sequence
(xt-AthEST). The dotted line stands for a consensus sequence; the
dashed line stands for a gap.

In Fig. 6, the schematic predicted β1,2-xylosyltransferase
protein (top) and the hydrophobicity index of the encoded
protein, derived using ProtScale (bottom), are illustrated, a
positive hydrophobicity index meaning an increased
hydrophobicity. Between these, the sizes of the four
above-indicated PCR products (P1-P4) are shown in relationship to
the cDNA. "C" denotes the region coding for the postulated
cytoplasmic region, "T" that for the postulated transmembrane
region, and "G" that for the postulated Golgi lumen catalytic
region of the transferase. The analysis of the protein sequence
by "TMpred" (from EMBnet) gave an assumed transmembrane region
from Ile11 to Phe29. The C-terminal region of the enzyme probably
comprises the catalytic region and consequently should point into
the lumen of the Golgi apparatus. According to this, this
transferase seems to be a type II transmembrane protein like all
the hitherto analysed glycosyltransferases which are involved in
glycoprotein biosynthesis (Joziasse, 1992, Glycobiology 2,
271-277). The grey regions represent the position of the two
peptides, the hexagons represent the potential N-glycosylation
sites. A BLAST search in data banks accessible via NCBI showed
high homology to only one other plant sequence (hybrid aspen,
Acc. No. AI62640).
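
A hydrophobicity profile of the kind shown in Fig. 6 is typically
a sliding-window average over a per-residue scale. The text does
not state which scale or window was selected in ProtScale, so the
Python sketch below assumes the Kyte-Doolittle scale and a window
of 9 residues; the test sequence is invented and the function name
is hypothetical.

```python
# Minimal sketch: sliding-window hydropathy profile.
# Assumptions: Kyte-Doolittle scale, window of 9; the scale actually used
# for Fig. 6 is not stated in the text.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list[float]:
    """Mean hydropathy of each full window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    if window < 1 or window > len(scores):
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Invented sequence: charged stretch followed by a hydrophobic stretch,
    # loosely mimicking a cytoplasmic tail next to a membrane anchor.
    toy = "MSKRRKRSLLIVLLALVFIAAL"
    for index, value in enumerate(hydropathy_profile(toy), start=1):
        print(index, round(value, 2))
```

Positive window averages mark hydrophobic stretches, which is the
sense in which a positive index is described above as meaning
increased hydrophobicity.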



Example 2:

Expression of recombinant β1,2-xylosyltransferase in insect cells

The entire coding region of the assumed β1,2-xylosyltransferase,
including the cytoplasmic and transmembrane regions, was removed
from the pUC19 vector by BamHI and EcoRI digestion and subcloned
into BamHI/EcoRI-digested and dephosphorylated baculovirus
transfer vector pVL1393 (PharMingen, San Diego, CA). Correct
cloning was confirmed by sequencing using pVL1393 forward primer
5'-AACCATCTCGCAAATAAATAAGTA-3' (SEQ ID NO: 10) and pVL1393 reverse
primer 5'-GTCGGGTTTAACATTACGGATTTC-3' (SEQ ID NO: 11). To ensure
homologous recombination, 1 µg of the transfer vector was
cotransfected with 200 ng linear Baculo-Gold viral DNA
(PharMingen, San Diego, CA) into 1 x 10* Sf-9 cells in IPL-41
medium using Lipofectin (Life Technologies) according to the
manufacturer's protocol. After an incubation of 5 days at 27°C,
various volumes of the supernatant with the recombinant virus
were used for infection of Sf-21 insect cells. Cells were
incubated in IPL-41 medium supplemented with 5 % heat-inactivated
fetal calf serum for 4 days at 27°C, then harvested and washed
twice with phosphate-buffered saline solution. The cells were
resuspended and homogenised in the following buffer (1 ml per
10^7 cells): 100 mM MES buffer, pH 7.0, with 1 % Triton X-100,
1 mM DTT, 1 mM PMSF, 5 µg/ml Leupeptin (Sigma) and 5 µg/ml E-64
(Serva), and incubated on ice for 30 min.
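
As a small illustration, basic properties of the two sequencing
primers quoted above (length and G+C content) can be computed as
follows. Only the two primer sequences are taken from the text;
the dictionary keys and function names are hypothetical.

```python
# Minimal sketch: length and G+C content of the pVL1393 sequencing primers
# (SEQ ID NO: 10 and 11 as quoted in the text).

PRIMERS = {
    "pVL1393_forward": "AACCATCTCGCAAATAAATAAGTA",
    "pVL1393_reverse": "GTCGGGTTTAACATTACGGATTTC",
}


def gc_count(sequence: str) -> int:
    """Number of G and C bases in a primer."""
    seq = sequence.upper()
    return seq.count("G") + seq.count("C")


def gc_fraction(sequence: str) -> float:
    """G+C content as a fraction of primer length."""
    return gc_count(sequence) / len(sequence)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), gc_count(seq), round(gc_fraction(seq), 3))
```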

Example 3: 

Assay for β1,2-xylosyltransferase activity

The cell homogenates were assayed for β1,2-xylosyltransferase
activity. Negative controls were carried out with the same number
of uninfected cells. The assay mixtures contained, in a total
volume of 20 µl, 13 µl of homogenised cells, 2 nmol dabsylated
GnGn hexapeptide or GnGn-pyridylamine as acceptor substrate
(Fig. 8a), 1 mM UDP-xylose as donor substrate, 10 mM ATP and
20 mM MnCl2; 1 mM 2-acetamido-1,2-dideoxy-nojirimycin was
included to prevent degradation of the product by
N-acetylhexosaminidase. The samples were incubated for 1 hour at
37°C and analysed by MALDI-TOF mass spectrometry.
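
The amounts and concentrations in the assay mixture are stated in
mixed units (nmol for the acceptor, mM for the other components).
The following Python sketch, offered only as a unit-conversion
aid, puts them on a common footing for the stated 20 µl volume;
the function names are hypothetical.

```python
# Minimal sketch: unit conversions for the 20 µl assay mixture.

ASSAY_VOLUME_UL = 20.0


def nmol_to_micromolar(amount_nmol: float, volume_ul: float) -> float:
    """Concentration in µM of `amount_nmol` dissolved in `volume_ul`."""
    return amount_nmol * 1000.0 / volume_ul  # nmol/µl equals mM


def millimolar_to_nmol(conc_mm: float, volume_ul: float) -> float:
    """Amount in nmol present at `conc_mm` in `volume_ul`."""
    return conc_mm * volume_ul  # mM x µl = nmol


if __name__ == "__main__":
    acceptor_um = nmol_to_micromolar(2.0, ASSAY_VOLUME_UL)
    donor_nmol = millimolar_to_nmol(1.0, ASSAY_VOLUME_UL)
    print(acceptor_um)         # acceptor concentration in µM
    print(donor_nmol)          # UDP-xylose amount in nmol
    print(donor_nmol / 2.0)    # molar excess of donor over acceptor
```

On these figures, 2 nmol of acceptor in 20 µl corresponds to
100 µM, and 1 mM UDP-xylose corresponds to 20 nmol, i.e. a tenfold
molar excess of donor over acceptor.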



Fig. 7 shows the measured enzyme activity of the recombinant
β1,2-xylosyltransferase as well as of the negative control. Grey
bars show the activity when GnGn hexapeptide was used as a
substrate, whereas black bars indicate the use of
GnGn-pyridylamine as a substrate. The enzyme activity of the
cotransfected cells was 30x higher than that of the negative
controls.

The structure of the acceptor substrate of
β1,2-xylosyltransferase is shown in Fig. 8a, and the postulated
product in Fig. 8b, where R represents either a pyridylamine or a
dabsylated hexapeptide residue.

Example 4:

Mass spectrometry of the xylosyltransferase product 

Mass spectrometry was performed on a DYNAMO (Bio-Analysis, Santa
Fe, NM), a MALDI-TOF MS which is capable of dynamic extraction
(synonym for late extraction). Two types of sample matrix
preparations were used: dabsylated glycopeptides were dissolved
in 5 % formic acid, and aliquots were applied to the target,
air-dried, and covered with 1 % alpha-cyano-4-hydroxycinnamic
acid. Pyridylaminated glycans were diluted with water, applied to
the target and air-dried. After addition of 2 %
2,5-dihydroxybenzoic acid, the samples were immediately dried by
applying vacuum.

Fig. 9 shows the mass spectra of these samples, (A) being the
negative control: the main peak (S) shows the
dabsyl-Val-Gly-Glu-(GlcNAc4Man3)Asn-Arg-Thr substrate, the
calculated [M+H]+ value being 2262.3. This substrate also appears
as a sodium adduct and as a smaller ion which has been formed by
fragmentation of the azo function of the dabsyl group, at (S*).
The peak at m/z = 2424.4 results from incomplete
de-galactosylation of the substrate. The mass spectrum (B) shows
the sample with recombinant β1,2-xylosyltransferase after
incubation for 1 h at 37°C. The main peak (P), having an [M+H]+
value of 2393.4, represents the xylosylated product.
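
The peak assignments above rest on mass differences relative to
the substrate. The following Python sketch compares such
differences with average residue masses of common
monosaccharides. The residue masses are rounded average values
calculated from the elemental compositions, and the tolerance of
1.5 Da is an arbitrary assumption made for this illustration, not
an instrument specification; the function name is hypothetical.

```python
# Minimal sketch: assign a mass shift to a monosaccharide residue.
# Residue masses are rounded average masses (sugar minus water);
# the 1.5 Da tolerance is an assumption for illustration only.

RESIDUE_MASS = {
    "pentose": 132.12,      # C5H8O4, e.g. xylose
    "deoxyhexose": 146.14,  # C6H10O4, e.g. fucose
    "hexose": 162.14,       # C6H10O5, e.g. galactose or mannose
    "HexNAc": 203.19,       # C8H13NO5, e.g. GlcNAc
}


def assign_shift(delta_mass: float, tolerance: float = 1.5):
    """Return the residue closest to `delta_mass`, or None if none fits."""
    name, mass = min(
        RESIDUE_MASS.items(), key=lambda item: abs(item[1] - delta_mass)
    )
    return name if abs(mass - delta_mass) <= tolerance else None


if __name__ == "__main__":
    substrate = 2262.3                       # calculated [M+H]+ of substrate
    print(assign_shift(2393.4 - substrate))  # main product peak (P)
    print(assign_shift(2424.4 - substrate))  # peak at m/z 2424.4
```

On this reading, the shift of the product peak is closest to one
pentose residue, as expected for the addition of xylose, and the
shift of the peak at m/z 2424.4 is closest to one hexose residue,
in line with incomplete de-galactosylation.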

Example 5: 

HPLC-analysis of the xylosyltransferase product 

Xylosyltransferase assays were performed as described above under
Example 3, except that 10 nmol of GnGn-pyridylamine were used as
the acceptor substrate. After 4 h of incubation the sample was
analyzed both by MALDI-TOF mass spectrometry and by reversed-phase
HPLC to verify the structure of the product. The presumed
product peak, eluting slightly ahead of the substrate GnGn-PA, was
collected. By MALDI-TOF MS the product's mass was determined to
be 1550.9, which is in good agreement with the product being GnGnX-PA.
Upon digestion with β-N-acetylglucosaminidase from bovine kidney (in
50 mM sodium citrate buffer of pH 5.0 for 20 h at 37°C with 25 mU
of enzyme), the glycan eluted at about the retention time of MM.
This is in keeping with published data on the retention of MM-PA
and MMX-PA (Wilson & Altmann, 1998, Glycoconj. J. 15, 1055-1070).
Further digestion with alpha-mannosidase from jack beans under
the chosen conditions (20 h at 37°C with 50 mU of enzyme) resulted
in the appearance of two new peaks. As the alpha-1,3-linked
mannose is considerably more sensitive to mannosidase than the
alpha-1,6-linked mannose, the peaks are assigned to 00X and M0X
(in the order of elution). Indeed, M0X-pyridylamine prepared from
bromelain by defucosylation with acid coeluted with the presumed
M0X derived from the xylosyltransferase product. The fairly large
shift of elution time due to the removal of the alpha-1,3-linked
mannose residue is a strong indication of the β-mannose being
substituted by xylose (Wilson & Altmann, 1998, Glycoconj. J. 15,
1055-1070; Altmann, 1998, Glycoconj. J. 15, 79-82).
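
The mass agreement stated above can be illustrated by a short calculation. This is a sketch under stated assumptions, not the authors' calculation: the text does not say which ion the value 1550.9 refers to, and the sketch assumes the sodium adduct [M+Na]+ and average (not monoisotopic) masses. The residue masses, the net mass gain of pyridylamination and all names in the code are assumptions introduced for illustration.

```python
# Average masses in Da; all constants are assumptions of this sketch.
RESIDUE = {
    "HexNAc": 203.195,  # N-acetylglucosamine residue, C8H13NO5
    "Hex": 162.142,     # mannose residue, C6H10O5
    "Pent": 132.116,    # xylose residue, C5H8O4
}
WATER = 18.015
PA_TAG = 78.118   # net gain of reductive amination with 2-aminopyridine
SODIUM = 22.990

GNGN = {"HexNAc": 4, "Hex": 3}               # GlcNAc4Man3
GNGNX = {"HexNAc": 4, "Hex": 3, "Pent": 1}   # GlcNAc4Man3Xyl


def pa_glycan_mass(composition: dict) -> float:
    """Average mass of a pyridylaminated glycan of the given composition."""
    sugars = sum(RESIDUE[name] * count for name, count in composition.items())
    return sugars + WATER + PA_TAG


observed = 1550.9  # value reported in the text for the product
calculated_sodium_adduct = pa_glycan_mass(GNGNX) + SODIUM
xylose_increment = pa_glycan_mass(GNGNX) - pa_glycan_mass(GNGN)

print(f"calculated [M+Na]+ of GnGnX-PA: {calculated_sodium_adduct:.1f}")
print(f"difference to observed value:   {observed - calculated_sodium_adduct:.1f}")
```

Under these assumptions the calculated value differs from the reported 1550.9 by less than 1 Da. With a different ion type or with monoisotopic masses the figures change accordingly.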

Fig. 10 shows the analysis of the xylosyltransferase product by
reversed-phase HPLC. (A) transferase incubation mixture; (B) isolated
xylosyltransferase product; (C) isolated xylosyltransferase
product after digestion with β-N-acetylglucosaminidase; (D) isolated
xylosyltransferase product after further digestion with
alpha-mannosidase. The assignments of peaks are as follows: 1,
GnGn-PA; 2, GnGnX-PA; 3, MMX-PA; 4, M0X-PA; 5, 00X-PA; 6, M0-PA
(from trace of substrate in isolated product). For abbreviations
of N-glycan structures see Wilson, I.B.H. and Altmann, F., 1998,
Glycoconj. J. 15, 1055-1070.

Fig. 11 shows the alignment of the predicted amino acid sequence
with the sequence according to WO 99/29835 A1. This alignment shows
that the amino acid sequence of the purified soybean enzyme corresponds
only to amino acids 199-469 of the sequence derived from the cDNA
according to the present invention. Furthermore, the predicted
amino acid sequence derived from the cDNA of the present application
contains two insertions (corresponding to aa 375-382 and aa
425-429 of the predicted sequence) compared to the sequence of
the purified soybean enzyme.
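
How such an alignment is scored can be sketched with a small example. The code below is an illustration only: it does not reproduce the alignment of Fig. 11. It compares soybean peptide 3 (SEQ ID NO: 1) with the first 27 residues of the A. thaliana sequence of SEQ ID NO: 3, which are aligned without gaps in Fig. 1. The function name is hypothetical.

```python
def percent_identity(a: str, b: str) -> tuple[int, float]:
    """Count identical positions in two gap-free, equally long aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches, 100.0 * matches / len(a)


soybean = "SQVQAIHDASVIIGAHGAGLTHIVSAL"      # SEQ ID NO: 1
arabidopsis = "DQVRAIQDASVIIGAHGAGLTHIVSAT"  # residues 1-27 of SEQ ID NO: 3

matches, identity = percent_identity(soybean, arabidopsis)
print(f"{matches} of {len(soybean)} positions identical ({identity:.1f}%)")
```

For alignments containing gaps, such as the one with the two insertions mentioned above, the denominator has to be chosen explicitly (for example the aligned length without gaps); the sketch deliberately avoids that case.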

Claims:

1. A DNA molecule, characterized in that it comprises a sequence 
according to SEQ ID NO: 8 with an open reading frame from base 
pair 227 to base pair 1831, or is at least 50% homologous with 
the above sequence, or hybridizes with the above sequence under 
stringent conditions, or comprises a sequence which has degenera- 
ted to the above DNA sequence due to the genetic code, with the 
sequence coding for a plant protein having xylosyl transferase ac- 
tivity or being complementary thereto. 

2. A DNA molecule according to claim 1, characterized in that it
codes for a protein having β1,2-xylosyltransferase activity.

3. A DNA molecule according to claims 1 or 2, characterized in 
that it is at least 70%, preferably at least 80%, particularly 
preferably at least 95% homologous with the sequence according to 
SEQ ID NO 8. 

4. A DNA molecule according to any one of claims 1 to 3, characterized
in that it comprises 1780 to 1880, particularly 1831 base
pairs.

5. A DNA molecule according to any one of claims 1 to 4, charac- 
terized in that it is covalently associated with a detectable 
marker substance. 

6. A DNA molecule according to any one of claims 1 to 5, charac- 
terized in that said DNA sequence comprises a deletion, insertion 
and/or substitution mutation. 

7. A DNA molecule coding for a ribozyme, characterized in that it
has two sequence sections, each of which has a length of at least
10 to 15 base pairs and which are complementary to the sequence
sections of a DNA molecule according to any one of claims 1 to 5
so that said ribozyme complexes and cuts the mRNA transcribed by
a natural β1,2-xylosyltransferase DNA molecule.

8. A biologically functional vector, characterized in that it
comprises a DNA molecule according to any one of claims 1 to 5.

9. A biologically functional vector, characterized in that it
comprises a DNA molecule according to any one of claims 1 to 5
being inversely orientated with respect to the promoter.

10. A biologically functional vector, characterized in that it 
comprises a DNA molecule according to claims 6 or 7. 

11. A method of preparing a cDNA comprising a DNA molecule according
to any one of claims 1 to 4, characterized in that RNA is
isolated from plant cells, particularly from leaf cells, and
a reverse transcription is effected with said RNA after the addition
of a reverse transcriptase and primers.

12. A method of cloning a β1,2-xylosyltransferase, characterized
in that a DNA molecule according to any one of claims 1 to 4 is
cloned into a vector which is subsequently transfected into a host
cell or a host, with cell lines being obtained by means of selection
and amplification of transfected host cells, which cell lines express
the active β1,2-xylosyltransferase.

13. A method of preparing recombinant host cells, particularly
plant cells, or plants, wherein the production of
β1,2-xylosyltransferase is suppressed or completely stopped,
characterized in that at least one of the vectors according to one of
claims 8 to 10 is inserted into said host cell or plant, respectively.

14. A method of preparing recombinant host cells, particularly
plant cells or plants, respectively, characterized in that the
DNA molecule according to claim 6 is inserted into the genome of
said host cell or plant, respectively, at the position of the
non-mutated, homologous sequence.

15. Recombinant plants or plant cells, characterized in that they
are prepared according to a method according to claims 13 or 14
and that their β1,2-xylosyltransferase production is suppressed
or completely stopped.

16. A PNA molecule, characterized in that it comprises a base
sequence complementary to the sequence of a DNA molecule according
to any one of claims 1 to 4.

17. A PNA molecule, characterized in that it comprises a base
sequence corresponding to the sequence of a DNA molecule according
to any one of claims 1 to 4.

18. A method of producing plants or plant cells, respectively,
particularly plant cells having blocked expression of
β1,2-xylosyltransferase at the transcription or translation level,
characterized in that PNA molecules according to claims 16 or 17 are
inserted into the cells.

19. A method of producing recombinant glycoproteins, characterized
in that the system according to claim 15 or plants or cells,
respectively, which are prepared according to a method according
to claim 18, is (are) transfected with the gene that expresses
the glycoprotein so that the recombinant glycoproteins are
expressed.

20. A method of producing recombinant human glycoproteins, cha- 
racterized in that the system according to claim 15 or plants or 
cells, respectively, which are prepared according to a method ac- 
cording to claim 18, is (are) transfected with the gene that ex- 
presses the glycoprotein so that the recombinant glycoproteins 
are expressed. 

21. A method of producing recombinant human glycoproteins for 
medical use, characterized in that the system according to claim 
15 or plants or cells, respectively, which are prepared according 
to a method according to claim 18, is (are) transfected with the 
gene that expresses the glycoprotein so that the recombinant gly- 
coproteins are expressed. 

22. Recombinant glycoproteins, characterized in that they are
prepared according to the method according to claim 19 in plant
systems and that their peptide sequence has less than 50%,
particularly less than 20%, particularly preferably 0% of the
β1,2-bound xylose residues present in proteins expressed in
non-xylosyltransferase-reduced plant systems.

23. Recombinant human glycoproteins, characterized in that they
are prepared according to the method according to claim 20 in
plant systems and that their peptide sequence has less than 50%,
particularly less than 20%, particularly preferably 0% of the
β1,2-bound xylose residues present in proteins expressed in
non-xylosyltransferase-reduced plant systems.

24. Recombinant human glycoproteins for medical use, characterized
in that they are prepared according to the method according
to claim 21 in plant systems and that their peptide sequence has
less than 50%, particularly less than 20%, particularly preferably
0% of the β1,2-bound xylose residues present in proteins expressed
in non-xylosyltransferase-reduced plant systems.

25. A pharmaceutical composition, characterized in that it com- 
prises recombinant glycoproteins according to any one of claims 
22 to 24. 

26. A method of selecting DNA molecules coding for a
β1,2-xylosyltransferase in a sample, characterized in that DNA molecules
according to claim 5 are added to said sample, which molecules
bind to the DNA molecules coding for a β1,2-xylosyltransferase.

27. A method according to claim 26, characterized in that said 
sample comprises genomic DNA of a plant or non-vertebrate animal 
organism. 

28. DNA molecules coding for a β1,2-xylosyltransferase, characterized
in that they are selected according to the method according
to claims 26 or 27 and are subsequently isolated from the
sample.

29. A preparation of β1,2-xylosyltransferase cloned according to
a method according to claim 12, characterized in that it has
isoforms having pI values of between 6.0 and 9.0, particularly
between 7.50 and 8.00.

30. A preparation according to claim 29, characterized in that it
has an isoform having a pI value of 7.52.

31. A method of preparing plantified carbohydrate units of human
and other vertebrate glycoproteins, characterized in that to a
sample comprising a carbohydrate unit or any glycoconjugate or a
glycoprotein, respectively, are added UDP-xylose and
β1,2-xylosyltransferase coded by a DNA molecule according to any one of
claims 1 to 5 so that xylose is bound to said carbohydrate unit
or glycoconjugate or said glycoprotein, respectively, at the
β1,2-position by said β1,2-xylosyltransferase.

32. The use of DNA according to any one of claims 1 to 6 or partial
sequences or of combinations of partial sequences for
immobilisation on DNA microarrays, e.g. for finding homologous
sequences or for expression studies in plants or non-vertebrate
animals.



KP/Se 
Peptide 3 from patent WO 99/29835 A1      SQVQAIHDASVIIGAHGAGLTHIVSAL
Peptide 2 from patent WO 99/29835 A1 (1)  GLEYHAIN

Soybean:     SQVQAIHDASVIIGAHGAGLTHIVSAL.........................GLEYHAIN
              QV AI DASVIIGAHGAGLTHIVSA                          GLEYHA
A. thaliana: DQVRAIQDASVIIGAHGAGLTHIVSATPNTTIFEIISVEFQRPHFELIAKWKGLEYHAMH

Primer 1  5'-ATGAGTAAACGGAATCCGAAG-3'
Primer 2  5'-TTAGCAGCCAAGGCTCTTCAT-3'

Primer 3  5'-GATCAAGTCCGAGCCATTCAA-3'
Primer 4  5'-CGCGTGATACTCCAATCCTTT-3'

(1) The C-terminal amino acids LG were omitted

Fig. 1
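
The orientation of the four primers of Fig. 1 relative to the cDNA can be checked mechanically. The following sketch is illustrative and not part of the original disclosure: it tests whether each primer occurs in a short excerpt of SEQ ID NO: 8 either directly (sense primer) or as its reverse complement (antisense primer). The excerpts and their base positions were copied from the sequence listing; the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def orientation(primer: str, target: str) -> str:
    """Classify a primer as sense, antisense or absent with respect to target."""
    if primer in target:
        return "sense"
    if reverse_complement(primer) in target:
        return "antisense"
    return "absent"


# Excerpts of SEQ ID NO: 8 (positions as numbered in the sequence listing).
ORF_START = "ATGAGTAAACGGAATCCGAAG"                      # bases 227-247
ORF_END = "ATGAAGAGCCTTGGCTGCTAA"                        # bases 1811-1831
INTERNAL_A = "ATGAAAGATCAAGTCCGAGCCATTCAAGAT"            # bases 1571-1600
INTERNAL_B = "GCTAAGTGGAAAGGATTGGAGTATCACGCGATGCATCTGG"  # bases 1721-1760

PRIMERS = {
    1: ("ATGAGTAAACGGAATCCGAAG", ORF_START),
    2: ("TTAGCAGCCAAGGCTCTTCAT", ORF_END),
    3: ("GATCAAGTCCGAGCCATTCAA", INTERNAL_A),
    4: ("CGCGTGATACTCCAATCCTTT", INTERNAL_B),
}

results = {n: orientation(p, t) for n, (p, t) in PRIMERS.items()}
for number, result in results.items():
    print(f"Primer {number}: {result}")
```

Primers 1 and 2 thus flank the open reading frame, while primers 3 and 4 flank the region coding for the peptide stretch aligned above.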
AAATCTGCAGACTCTCAAAATTCCGATTCATCTTATTGAAGAACAA  46
TTTTCCGGCGAAACAGCCGATGAAGTCTCGCCTGAATCTTCTGTACCTTTCACCGGCGAT  106
TGACTTCACTTCAGAATCGAGAGAGAAGAAATCGATGGAAAACTAAAAATAGAAAGAGTT  166
TCAAATTCTCGCTCTCTCTTCAAAACCGCAAATCAAGGGAACGAGAGACGAGAGAGAGAG  226
ATGAGTAAACGGAATCCGAAGATTCTGAAGATTTTTCTGTATATGTTACTTCTCAACTCT  286
CTCTTTCTCATCATCTACTTCGTTTTTCACTCATCGTCGTTTTCACCGGAGCAGTCACAG  346
CCTCCTCATATATACCACGTTTCAGTGAATAACCAATCGGCGATTCAGAAACCGTGGCCG  406
ATCTTACCTTCTTACCTCCCATGGACGCCGCCGCAGAGGAATCTACCAACTGGCTCCTGC  466
GAAGGTTACTTCGGGAATGGATTTACAAAGAGAGTTGACTTCCTTAAGCCGAGGATTGGA  526
GGAGGAGGAGAAGGAAGCTGGTTCCGATGTTTTTACAGTGAGACATTACAGAGTTCGATT  586
TGTGAAGGAAGGAATCTGAGAATGGTTCCGGATCGGATTGTTATGTCGAGAGGAGGTGAG  646
AAGTTAGAGGAAGTTATGGGGAGGAAAGAGGAGGAGGAGCTTCCTGCGTTTCGACAAGGT  706
GCGTTTGAGGTAGCGGAAGAGGTTTCTTCACGGTTAGGTTTTAAGAGACACCGTCGTTTT  766
GGTGGAGGAGAAGGAGGTAGTGCGGTTTCTCGGCGGCTGGTGAATGATGAGATGTTGAAT  826
GAATATATGCAAGAAGGTGGAATTGATAGACATACAATGAGAGATTTGGTTGCTTCGATT  886
CGTGCTGTTGATACCAATGATTTCGTTTGTGAAGAGTGGGTGGAGGAACCGACCTTGCTT  946
GTCACTAGATTCGAGTACGCAAATCTCTTCCATACTGTGACAGATTGGTATAGTGCCTAT  1006
GTTTCGTCTAGAGTCACCGGTTTGCCTAATCGACCTCACGTTGTTTTCGTTGACGGACAC  1066
TGCACGACGCAGCTAGAAGAAACATGGACAGCTTTGTTTTCCGGAATCAGATACGCAAAG  1126
AACTTCACCAAACCGGTTTGTTTCCGCCACGCGATTCTTTCACCATTGGGATACGAAACC  1186
GCTCTTTTTAAAGGCTTGTCCGGAGAAATAGACTGCAAGGGAGATTCAGCTCACAATCTG  1246
TGGCAAAACCCGGACGATAAAAGGACTGCGAGGATATCAGAGTTTGGTGAAATGATCAGA  1306
GCAGCTTTCGGGTTGCCTGTCAATAGACACCGCTCATTAGAAAAGCCGCTATCATCATCA  1366
TCATCATCAGCCTCAGTTTATAATGTTCTTTTTGTCCGCCGTGAAGATTACTTAGCCCAT  1426
CCTCGTCATGGCGGTAAAGTCCAGTCTCGGCTCATCAATGAGGAAGAAGTGTTCGACTCG  1486
TTGCATCATTGGGTTGCAACTGGGTCCACCGGTCTGACCAAATGCGGGATTAACCTTGTG  1546
AATGGCTTGCTTGCACACATGTCAATGAAAGATCAAGTCCGAGCCATTCAAGATGCTTCA  1606
GTGATCATAGGAGCTCATGGAGCAGGACTGACTCACATTGTCTCTGCAACACCAAACACA  1666
ACGATATTTGAGATAATAAGCGTCGAGTTTCAGAGACCTCATTTCGAGCTTATAGCTAAG  1726
TGGAAAGGATTGGAGTATCACGCGATGCATCTGGCGAACTCACGAGCGGAACCAACGGCT  1786
GTGATTGAGAAGTTAACGGAGATCATGAAGAGCCTTGGCTGCTAA  1831

Fig. 2
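
The relation between the cDNA of Fig. 2 and the open reading frame named in claim 1 (base pair 227 to base pair 1831) can be illustrated as follows. This is a minimal sketch, not part of the original disclosure: it computes the length of the reading frame and translates only its first and last seven codons, using a deliberately partial codon table that covers just those codons. All names are illustrative.

```python
# Coordinates of the open reading frame as stated in claim 1.
ORF_FIRST, ORF_LAST = 227, 1831

orf_length = ORF_LAST - ORF_FIRST + 1      # bases, both ends inclusive
codons, remainder = divmod(orf_length, 3)  # a complete frame leaves no remainder
residues = codons - 1                      # the final codon is the stop codon

# Partial codon table: only the codons needed for the two excerpts below.
CODON_TABLE = {
    "ATG": "M", "AGT": "S", "AAA": "K", "CGG": "R", "AAT": "N",
    "CCG": "P", "AAG": "K", "AGC": "S", "CTT": "L", "GGC": "G",
    "TGC": "C", "TAA": "*",
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon using the partial table."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


orf_start = "ATGAGTAAACGGAATCCGAAG"  # bases 227-247 of the cDNA
orf_end = "ATGAAGAGCCTTGGCTGCTAA"    # bases 1811-1831 of the cDNA

print(orf_length, codons, remainder, residues)
print(translate(orf_start), translate(orf_end))
```

The result of 534 residues agrees with the length of the amino acid sequence shown in Fig. 3, which begins with MSKRNPK and ends with MKSLGC.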
        10        20        30        40        50        60
MSKRNPKILKIFLYMLLLNSLFLIIYFVFHSSSFSPEQSQPPHIYHVSVNNQSAIQKPWP

        70        80        90       100       110       120
ILPSYLPWTPPQRNLPTGSCEGYFGNGFTKRVDFLKPRIGGGGEGSWFRCFYSETLQSSI

       130       140       150       160       170       180
CEGRNLRMVPDRIVMSRGGEKLEEVMGRKEEEELPAFRQGAFEVAEEVSSRLGFKRHRRF

       190       200       210       220       230       240
GGGEGGSAVSRRLVNDEMLNEYMQEGGIDRHTMRDLVASIRAVDTNDFVCEEWVEEPTLL

       250       260       270       280       290       300
VTRFEYANLFHTVTDWYSAYVSSRVTGLPNRPHVVFVDGHCTTQLEETWTALFSGIRYAK

       310       320       330       340       350       360
NFTKPVCFRHAILSPLGYETALFKGLSGEIDCKGDSAHNLWQNPDDKRTARISEFGEMIR

       370       380       390       400       410       420
AAFGLPVNRHRSLEKPLSSSSSSASVYNVLFVRREDYLAHPRHGGKVQSRLINEEEVFDS

       430       440       450       460       470       480
LHHWVATGSTGLTKCGINLVNGLLAHMSMKDQVRAIQDASVIIGAHGAGLTHIVSATPNT

       490       500       510       520       530
TIFEIISVEFQRPHFELIAKWKGLEYHAMHLANSRAEPTAVIEKLTEIMKSLGC

Fig. 3
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£ i ATCACTAAAC ^ TCC01AC ^ AACAT ^""ATCTTACrrC TC AiCTCTC TC TTrC TC A TC * TCTACT T : 80 

^ C4 

xt-Athi6.seq : - : 80 

xt-Athgen.aeq : : 60 

xt-AthEST.seq : Hill • 80 



jct-ithis.scq : ; ;;;;;; A : i«> 

xt-Atho*n. a eq : : 160 

xt-Ath£ST.oeq : LL11L*,* * : 160 



• ?? A ™ 6AAA ?? c y?ff? 6A y!™ < *«> 

xt-Athl6.seq : "..WW" ' « 2« 

xc-Atho«.aeq : ;" * 240 

xt-AthEST.aeq : : 240 



Ke-A t h6. 8 eq : CAACCTTACTTCCCCIATCCATTTACAAAGACACTTCACTTCCTTAACCCCACCATTCCICCACCAOCAGAACCA 



Xt-Ath9.8eq t — .^^ACCAOOACAACCAACCTC : 320 

«t-Athl«..e, : ; : 320 

xc-Athgen.aeq s : 320 

xt-AthTST.aeq : ' : 320 



xt-Athl6.aeq : 1 . !! : 400 

xt-Ath*en. OC q : : 400 

xt-A.thE3T.aeq : -ILL s * 00 



£-££'£2 : ™ GTCGkGkGGXGGTGkGXLG ^ XCk ^^ : 480 

xt-Athl6.seq : k : 480 

xt-Athgen.aeq : : 480 

xt-A.thEST.seq : LLLLLLLLLLLLLLLLLLL : 480 



xtiitl^:,:, ; cc ^ CA ^ T *^ A * c * cc ™™ cw ™*™" s 560 

xt-Athl6.se, : """" i^"."".'"".".' ! 5 « 

xt-Athoen.aeq : : S60 

xt-AthtST.aeq : 11111111 ! 560 



Fig. 4a 
[Alignment of the nucleotide sequences xt-Ath6.seq, xt-Ath9.seq,
xt-Ath16.seq, xt-Athgen.seq and xt-AthEST.seq, positions 561 to 1120]

Fig. 4b
[Alignment of the nucleotide sequences xt-Ath6.seq, xt-Ath9.seq,
xt-Ath16.seq, xt-Athgen.seq and xt-AthEST.seq, positions 1121 to 1605]

Fig. 4c
[Alignment of the amino acid sequences derived from xt-Ath6.seq,
xt-Ath9.seq, xt-Ath16.seq, xt-Athgen.seq and xt-AthEST.seq,
positions 1 to 534]

Fig. 5
SEQUENCE LISTING

WO 01/64901    PCT/EP01/02352

<110> Glossl Prof., Josef

<120> xylosyltransferase-gene

<130> xylosyltransferase-gene

<140>
<141>

<160> 11

<170> PatentIn Ver. 2.1

<210> 1
<211> 27
<212> PRT
<213> soyabean

<400> 1
Ser Gln Val Gln Ala Ile His Asp Ala Ser Val Ile Ile Gly Ala His
1 5 10 15

Gly Ala Gly Leu Thr His Ile Val Ser Ala Leu
20 25

<210> 2
<211> 7
<212> PRT
<213> soyabean

<400> 2
Gly Leu Glu Tyr His Ala Ile Asn
1 5

<210> 3
<211> 60
<212> PRT
<213> Arabidopsis thaliana

<400> 3
Asp Gln Val Arg Ala Ile Gln Asp Ala Ser Val Ile Ile Gly Ala His
1 5 10 15

Gly Ala Gly Leu Thr His Ile Val Ser Ala Thr Pro Asn Thr Thr Ile
20 25 30

Phe Glu Ile Ile Ser Val Glu Phe Gln Arg Pro His Phe Glu Leu Ile
35 40 45

Ala Lys Trp Lys Gly Leu Glu Tyr His Ala Met His
50 55 60


<210> 4
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer

<400> 4
gatcaagtcc gagccattca a 21

<210> 5
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer

<400> 5
cgcgtgatac tccaatcctt t 21

<210> 6
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer

<400> 6
atgagtaaac ggaatccgaa g 21

<210> 7
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer

<400> 7
ttagcagcca aggctcttca t 21

<210> 8 
<211> 1831 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 8 

aaatctgcag actctcaaaa ttccgattca tcttattgaa gaacaatttt ccggcgaaac 60 
agccgatgaa gtctcgcctg aatcttctgt acctttcacc ggcgattgac ttcacttcag 120 
aatcgagaga gaagaaatcg atggaaaact aaaaatagaa agagtttcaa attctcgctc 180 
tctcttcaaa accgcaaatc aagggaacga gagacgagag agagagatga gtaaacggaa 240 
tccgaagatt ctgaagattt ttctgtatat gttacttctc aactctctct ttctcatcat 300 
ctacttcgtt tttcactcat cgtcgttttc accggagcag tcacagcctc ctcatatata 360 
ccacgtttca gtgaataacc aatcggcgat tcagaaaccg tggccgatct taccttctta 420 
cctcccatgg acgccgccgc agaggaatct accaactggc tcctgcgaag gttacttcgg 480 
gaatggattt acaaagagag ttgacttcct taagccgagg attggaggag gaggagaagg 540 
aagctggttc cgatgttttt acagtgagac attacagagt tcgatttgtg aaggaaggaa 600 
tctgagaatg gttccggatc ggattgttat gtcgagagga ggtgagaagt tagaggaagt 660 
tatggggagg aaagaggagg aggagcttcc tgcgtttcga caaggtgcgt ttgaggtagc 720 
ggaagaggtt tcttcacggt taggttttaa gagacaccgt cgttttggtg gaggagaagg 780 
aggtagtgcg gtttctcggc ggctggtgaa tgatgagatg ttgaatgaat atatgcaaga 840 
aggtggaatt gatagacata caatgagaga tttggttgct tcgattcgtg ctgttgatac 900 
caatgatttc gtttgtgaag agtgggtgga ggaaccgacc ttgcttgtca ctagattcga 960 
gtacgcaaat ctcttccata ctgtgacaga ttggtatagt gcctatgttt cgtctagagt 1020 
caccggtttg cctaatcgac ctcacgttgt tttcgttgac ggacactgca cgacgcagct 1080 
agaagaaaca tggacagctt tgttttccgg aatcagatac gcaaagaact tcaccaaacc 1140 
ggtttgtttc cgccacgcga ttctttcacc attgggatac gaaaccgctc tttttaaagg 1200 
cttgtccgga gaaatagact gcaagggaga ttcagctcac aatctgtggc aaaacccgga 1260 
cgataaaagg actgcgagga tatcagagtt tggtgaaatg atcagagcag ctttcgggtt 1320 
gcctgtcaat agacaccgct cattagaaaa gccgctatca tcatcatcat catcagcctc 1380 
agtttataat gttctttttg tccgccgtga agattactta gcccatcctc gtcatggcgg 1440 
taaagtccag tctcggctca tcaatgagga agaagtgttc gactcgttgc atcattgggt 1500 
tgcaactggg tccaccggtc tgaccaaatg cgggattaac cttgtgaatg gcttgcttgc 1560 
acacatgtca atgaaagatc aagtccgagc cattcaagat gcttcagtga tcataggagc 1620 
tcatggagca ggactgactc acattgtctc tgcaacacca aacacaacga tatttgagat 1680 
aataagcgtc gagtttcaga gacctcattt cgagcttata gctaagtgga aaggattgga 1740 
gtatcacgcg atgcatctgg cgaactcacg agcggaacca acggctgtga ttgagaagtt 1800 
aacggagatc atgaagagcc ttggctgcta a 1831

<210> 9
<211> 533
<212> PRT
<213> Arabidopsis thaliana

<400> 9

Met Ser Lys Arg Asn Pro Lys Ile Leu Lys Ile Phe Leu Tyr Met Leu
1 5 10 15

Leu Leu Asn Ser Leu Phe Leu Ile Ile Tyr Phe Val Phe His Ser Ser
20 25 30

Ser Phe Ser Pro Glu Gln Ser Gln Pro Pro His Ile Tyr His Val Ser
35 40 45

Val Asn Asn Gln Ser Ala Ile Gln Lys Pro Trp Pro Ile Leu Pro Ser
50 55 60

Tyr Leu Pro Trp Thr Pro Pro Gln Arg Asn Leu Pro Thr Gly Ser Cys
65 70 75 80

Glu Gly Tyr Phe Gly Asn Gly Phe Thr Lys Arg Val Asp Phe Leu Lys
85 90 95

Pro Arg Ile Gly Gly Gly Gly Glu Gly Ser Trp Phe Arg Cys Phe Tyr
100 105 110

Ser Glu Thr Leu Gln Ser Ser Ile Cys Glu Gly Arg Asn Leu Arg Met
115 120 125

Val Pro Asp Arg Ile Val Met Ser Arg Gly Gly Glu Lys Leu Glu Glu
130 135 140

Val Met Gly Arg Lys Glu Glu Glu Glu Leu Pro Ala Phe Arg Gln Gly
145 150 155 160

Ala Phe Glu Val Ala Glu Glu Val Ser Ser Arg Leu Gly Phe Lys Arg
165 170 175

His Arg Arg Phe Gly Gly Gly Glu Gly Gly Ser Ala Val Ser Arg Arg
180 185 190

Leu Val Asn Asp Glu Met Leu Asn Glu Tyr Met Gln Glu Gly Gly Ile
195 200 205

Asp Arg His Thr Met Arg Asp Leu Val Ala Ser Ile Arg Ala Val Asp
210 215 220

Thr Asn Asp Phe Val Cys Glu Glu Trp Val Glu Glu Pro Thr Leu Leu
225 230 235 240

Val Thr Arg Phe Glu Tyr Ala Asn Leu Phe His Thr Val Thr Asp Trp
245 250 255

Tyr Ser Ala Tyr Val Ser Ser Arg Val Thr Gly Leu Pro Asn Arg Pro
260 265 270

His Val Val Phe Val Asp Gly His Cys Thr Thr Gln Leu Glu Glu Thr
275 280 285

Trp Thr Ala Leu Phe Ser Gly Ile Arg Tyr Ala Lys Asn Phe Thr Lys
290 295 300

Pro Val Cys Phe Arg His Ala Ile Leu Ser Pro Leu Gly Tyr Glu Thr
305 310 315 320

Ala Leu Phe Lys Gly Leu Ser Gly Glu Ile Asp Cys Lys Gly Asp Ser
325 330 335

Ala His Asn Leu Trp Gln Asn Pro Asp Asp Lys Arg Thr Ala Arg Ile
340 345 350

Ser Glu Phe Gly Glu Met Ile Arg Ala Ala Phe Gly Leu Pro Val Asn
355 360 365

Arg His Arg Ser Leu Glu Lys Pro Leu Ser Ser Ser Ser Ser Ser Ala
370 375 380

Ser Val Tyr Asn Val Leu Phe Val Arg Arg Glu Asp Tyr Leu Ala His
385 390 395 400

Pro Arg His Gly Gly Lys Val Gln Ser Arg Leu Ile Asn Glu Glu Glu
405 410 415

Val Phe Asp Ser Leu His His Trp Val Ala Thr Gly Ser Thr Gly Leu
420 425 430

Thr Lys Cys Gly Ile Asn Leu Val Asn Gly Leu Leu Ala His Met Ser
435 440 445

Met Lys Asp Gln Val Arg Ala Ile Gln Asp Ala Ser Val Ile Ile Gly
450 455 460

Ala His Gly Ala Gly Leu Thr His Ile Val Ser Ala Thr Pro Asn Thr
465 470 475 480

Thr Ile Phe Glu Ile Ile Ser Val Glu Phe Gln Arg Pro His Phe Glu
485 490 495

Leu Ile Ala Lys Trp Lys Gly Leu Glu Tyr His Ala Met His Leu Ala
500 505 510

Asn Ser Arg Ala Glu Pro Thr Ala Val Ile Glu Lys Leu Thr Glu Ile
515 520 525

Met Lys Ser Leu Gly Cys
530



<210> 10
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer

<400> 10
aaccatctcg caaataaata agta 24

<210> 11
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer

<400> 11
gtcgggttta acattacgga tttc 24
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